Sorting a numeric list using shortcuts and regular expressions

I was working on a shortcut that needed to sort a comma-separated list of numbers retrieved from the web.

This led me to discover a useful sorting feature in Shortcuts and (inspired by episode 34) to use a number of regular expressions.

I thought I’d share the results here as others may find these useful.

The list in question is of the form…
11,2,1,45,7,21

The first shortcut below simply takes this list, splits it up, calls a second shortcut to do the sorting and displays the result.

The trickier piece was doing the actual sorting.

I discovered that the Filter Files action in Shortcuts can be used to sort a list of text strings. Unfortunately, if you just apply this to a list of numbers, the values are compared as text, character by character, and you get something like 1, 11, 2, 21, 45, 7.
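
For anyone who wants to see the same effect outside Shortcuts, here is a small Python illustration. It is only an analogy: Python's sorted() stands in for the Filter Files action, and the variable names are made up for the example.

```python
# The example list from the post, held as text strings.
numbers = ["11", "2", "1", "45", "7", "21"]

# Sorting as text compares character by character, so "11" lands before "2".
text_sorted = sorted(numbers)

# Sorting by numeric value gives the order we actually want.
numeric_sorted = sorted(numbers, key=int)

print(text_sorted)     # ['1', '11', '2', '21', '45', '7']
print(numeric_sorted)  # ['1', '2', '7', '11', '21', '45']
```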

To get round this, I padded the numbers with leading zeroes to a fixed length (e.g. 011, 002, 001, 045, 007, 021), which gives the correct order when sorted. I then removed the leading zeroes again to make the output more readable.

This was achieved using three separate Replace Text actions, each employing regular expressions to do the following…

  • Pad the number with an arbitrary number of leading zeroes
  • Trim the number to a fixed character length
  • Remove the leading zeroes after sorting

This can be seen in the Shortcut below.

Note that I’ve expanded the first Replace Text action to show the Regular Expression switch, which needs to be toggled on for all three instances.

Also note that the shortcut currently handles numbers with up to 12 digits. If you want to sort longer numbers, you just need to add more zeroes to the text string in the first action.
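
Since the shortcut itself only exists as a screenshot, here is a rough Python sketch of the same three-step idea. Treat it as an approximation rather than a transcription: the function name and the first two patterns are assumptions, and only the final ^0* pattern is taken from the thread. Python's sorted() stands in for Filter Files, and the 12-zero pad string mirrors the 12-digit limit mentioned above.

```python
import re

WIDTH = 12          # maximum number of digits handled (matches the 12-digit limit)
PAD = "0" * WIDTH   # the string of zeroes added in the first step

def pad_sort_numbers(csv_text):
    """Sort non-negative integers in a comma-separated string via zero padding."""
    items = [item.strip() for item in csv_text.split(",")]

    # Step 1: pad every number with an arbitrary (generous) run of leading zeroes.
    padded = [re.sub(r"^(\d+)$", PAD + r"\g<1>", item) for item in items]

    # Step 2: trim to a fixed length by keeping only the last WIDTH digits.
    # (Assumed pattern; the original shortcut's expression is not shown.)
    trimmed = [re.sub(r"^0*(\d{%d})$" % WIDTH, r"\1", p) for p in padded]

    # Equal-length strings now sort as text in numeric order.
    ordered = sorted(trimmed)

    # Step 3: remove the leading zeroes again; fall back to "0" for zero itself.
    return [re.sub(r"^0*", "", s) or "0" for s in ordered]

print(pad_sort_numbers("11,2,1,45,7,21"))
```

Called on the example list, this should return the values in the order 1, 2, 7, 11, 21, 45.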

The sorting method works for this data set, but you need to tweak the first regular expression in the sort shortcut to ignore the leading spaces in front of the numbers.

This would need to be something like ^(\s+)?(\d+)
and you’d need to change $0 to $2 in the replace string, so that only the digits (the second capture group) are kept and the leading spaces are dropped.
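
As a quick check of that pattern, here is a hedged Python equivalent. The function name and the 12-zero pad string are assumptions for illustration. Note that Python writes the group reference as \g<2>, where the Shortcuts replace string uses $2.

```python
import re

def pad_ignoring_leading_space(item, pad="0" * 12):
    # Group 1 is the optional leading whitespace, group 2 is the digits.
    # Only group 2 is kept, so the whitespace is discarded.
    return re.sub(r"^(\s+)?(\d+)", pad + r"\g<2>", item)

print(pad_ignoring_leading_space(" 45"))  # 00000000000045
print(pad_ignoring_leading_space("45"))   # 00000000000045
```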

Incidentally, I now have an enhanced version of the sort shortcut that handles negative numbers and decimals as well as positive integers. I’ve also just enhanced it to ignore leading white space, based on this comment.

Happy to share this if anyone is interested.

The first replace parameter in the last action of the original screenshot is ^0*. The ^ specifies that the match must be at the start of the string.

Oh yeah, missed that. As a sidebar, one doesn’t have to use zeroes to pad; any character can be used, as it’s simply about displacing the position of the leading significant digit relative to the start of the string.

I thought your method (discovery) was really quite clever, using the Filter Files action to perform the sort. My initial method for sorting numbers used a purely RegEx approach: iterating through the list, retrieving the minimum value, scrubbing it from the original list (RegEx), and adding it to a new, ordered list. It was very slow, though.
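
To make that slower approach concrete, here is a rough Python sketch of the idea. It is a reconstruction under assumptions, not the original shortcut: the function name and the removal pattern are invented for illustration. Each pass rescans the whole list for its minimum, which is why it gets slow as the list grows.

```python
import re

def selection_sort_csv(csv_text):
    """Repeatedly pull the minimum out of a comma-separated string of numbers."""
    remaining = csv_text
    ordered = []
    while remaining.strip():
        tokens = [t.strip() for t in remaining.split(",") if t.strip()]
        # Retrieve the minimum value (compared numerically).
        smallest = min(tokens, key=float)
        ordered.append(smallest)
        # Scrub one occurrence of that value, plus its leading comma, from the list.
        pattern = r"(?:^|,)\s*" + re.escape(smallest) + r"\s*(?=,|$)"
        remaining = re.sub(pattern, "", remaining, count=1).lstrip(",")
    return ordered

print(selection_sort_csv("11,2,1,45,7,21"))
```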

Adapting your technique, I decided that, instead of padding the number strings, it might be simpler to divide them all by a constant. If the constant shifts the decimal point to the left by a fixed number of places, enough to ensure all values have a magnitude less than 10, it has the same effect of displacing the leading significant digit, so that a lexicographical sort behaves like a numerical sort.

The constant is derived from the list’s maximum value, whose logarithm tells you how many decimal places to shift left for the sort to work. Since the approach is numerical rather than string-based, it naturally works with any number expressed in decimal notation. Negative values come out in the reverse of the order of positive values, but this is easily dealt with by grabbing the negative values and reversing their order.
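
A minimal Python sketch of this decimal-shift idea follows, under a few assumptions: the function name is hypothetical, Decimal.adjusted() is used as an exact stand-in for the floor of the base-10 logarithm, and sorting on text keys stands in for the Filter Files action. It is not a transcription of the attached shortcut.

```python
from decimal import Decimal

def sort_by_decimal_shift(csv_text):
    """Sort decimal numbers by sorting their shifted text forms lexicographically."""
    values = [Decimal(tok.strip()) for tok in csv_text.split(",") if tok.strip()]
    if not values:
        return []

    # Number of places to shift left, taken from the largest magnitude.
    # adjusted() gives the exponent of the leading digit, i.e. floor(log10(x)).
    largest = max(abs(v) for v in values)
    shift = max(0, largest.adjusted())

    def text_key(v):
        # Divide by 10**shift so every magnitude is below 10, then format as
        # plain fixed-point text so the keys compare character by character.
        return format(abs(v).scaleb(-shift), "f")

    # Negatives are sorted on magnitude and reversed; the rest sort normally.
    negatives = sorted((v for v in values if v < 0), key=text_key, reverse=True)
    non_negatives = sorted((v for v in values if v >= 0), key=text_key)
    return [format(v, "f") for v in negatives + non_negatives]

print(sort_by_decimal_shift("11,2,1,45,7,21"))
```

With the original example list, the largest value is 45, so every number is divided by 10 and the keys become 1.1, 0.2, 0.1, 4.5, 0.7 and 2.1, which sort correctly as text.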

[Screenshot and attachment: Sort Numerical List.shortcut (3KB)]

Just for info, the file sorting trick has been around since Workflow days. I know Ari, one of the original Workflow team, recommended it as a workaround on Twitter and Reddit in the past. It’s still as relevant today as it was then.

Just to throw another numeric only sort option into the mix:

The example takes a numeric, comma-separated list of

22,1,111,-12,-32, -31.9, -31.8,0,0.2, 0.13

and returns

-32,-31.9,-31.8,-12,0,0.13,0.2,1,22,111

It deals with negatives, decimals, and additional whitespace between commas.

Thus showing there are often multiple valid ways to solve the same problem.

I went with the string approach mainly because it gave me an opportunity to play with regular expressions, but I agree the mathematical approach works just as well and may be better for some data sets.

I’ve posted my enhanced Shortcut below for comparison. It’s slightly shorter than yours, but that’s mainly due to the changes in iOS 13 removing the need for so many Set and Get Variables.

Interestingly, I’d adopted exactly the same approach as you did to handling negative numbers, by having two separate Filter Files actions.

Just one thing to note is that while you don’t have to use zeroes to pad in my approach, you can’t just use any character. It has to sort before “1” or it will mess up the order. So you can’t use “A” for instance.
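
A small Python illustration of that point, as a sketch only: the helper name is made up, and sorted() again stands in for Filter Files. Padding with "0" or a space works because both sort before "1", while "A" sorts after the digits and puts the longer numbers first.

```python
def sort_with_pad(numbers, pad_char):
    """Left-pad number strings to equal width with pad_char, then sort as text."""
    width = max(len(n) for n in numbers)
    padded = sorted(n.rjust(width, pad_char) for n in numbers)
    # Strip the padding again; keep a lone "0" if the value was zero.
    return [p.lstrip(pad_char) or "0" for p in padded]

numbers = ["11", "2", "1", "45", "7", "21"]
print(sort_with_pad(numbers, "0"))  # ['1', '2', '7', '11', '21', '45']
print(sort_with_pad(numbers, " "))  # ['1', '2', '7', '11', '21', '45']
print(sort_with_pad(numbers, "A"))  # ['11', '21', '45', '1', '2', '7']
```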

Very good point, thanks for catching that.