RegEx - restrict to 1st line

I need to isolate the first number on the first line of some text. I have tried the following:
([0-9])(,.) and replacing the text with $1. This seems to work in the regex101 tester but not in the Shortcuts Replace Text action. How can I end up with just the first number, 23252?

Would this suffice?

Thanks but when I use that in the replace text action it does not work. What am I missing?

I thought you wanted to get the first number? A match operation is sufficient for that.

What are you trying to achieve with your replace approach?

I mean you can do things like the below, but I’m trying to figure out what you are trying to do so that I can offer an appropriate approach and regex.

Just leave that number on the first line.

Replace everything but the first number on the first line: this is equivalent to the match approach, but it needs a more complex regex.

Sorry for being unclear. I just want to end up with the number 23252. My regular expression works fine in the tester but it does not produce the same results in the replace text action.

You have a multiline string, and you want to act on a single line. So as your first step, split the string into its constituent lines, and then act on only the first one:
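As a rough illustration of the split-then-match idea outside of Shortcuts, here is a minimal Python sketch. The sample text and the function name `first_number_of_first_line` are made up for the example (only the leading 23252 comes from this thread), and Python's `re` module is not the ICU engine that Shortcuts uses, although a simple digit pattern like this one should behave the same in both.

```python
import re

# Hypothetical sample input; only the leading 23252 comes from the thread.
text = "23252,alpha,beta\n118,gamma,delta\n7,epsilon,zeta"


def first_number_of_first_line(s):
    """Return the first run of digits on the first line, or None."""
    lines = s.splitlines()      # step 1: split into constituent lines
    if not lines:
        return None
    match = re.search(r"[0-9]+", lines[0])  # step 2: act on the first line only
    return match.group(0) if match else None


print(first_number_of_first_line(text))
```

In Shortcuts the same two steps would be a Split Text action followed by a Match Text action on the first item.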

Okay, so that’s exactly what the initial match I suggested using regex does. There’s no need to do any substitutions, just the match.

If there is some particular reason you need a substitution instead, the second example in my second reply shows how to do that.

Two things to note about regular expression testers (regex101 is my preferred choice too).

  1. There are different flavours of regular expression. Apple/Shortcuts uses ICU, which isn’t available on regex101. You can still use regex101, but you need to be aware that the flavours can differ.
  2. You can specify particular engine options. These are found at the end of the regular expression field in regex101 and default to global and multiline, with the flavour of regex being PCRE.

If I take your first example with the default settings in regex101, I don’t get the match you describe. It subs in the first entry on every line, not just everything for the first entry on the first line. I tried changing the options, but I didn’t get the result you seem to have got where it just leaves you with that first number.
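To show why the multiline option matters here, this is a small Python sketch. It assumes the pattern was meant to be `^([0-9]*)(,.*)` (the asterisks and anchor are my guess, not something confirmed in the thread), uses invented sample text, and relies on Python's `re` rather than PCRE or ICU. Python's `re.sub` replaces every match by default, which is comparable to having the global option switched on.

```python
import re

# Hypothetical two-line sample; only the leading 23252 comes from the thread.
text = "23252,alpha,beta\n118,gamma,delta"

# Assumed reading of the pattern: first number, then the rest of the line.
pattern = r"^([0-9]*)(,.*)"

# With MULTILINE, ^ matches at the start of every line,
# so every line is reduced to its first number.
with_multiline = re.sub(pattern, r"\1", text, flags=re.MULTILINE)

# Without MULTILINE, ^ only matches at the very start of the text,
# so only the first line is changed and the rest is left in place.
without_multiline = re.sub(pattern, r"\1", text)

print(repr(with_multiline))
print(repr(without_multiline))
```

Neither variant leaves only 23252, which fits the observation above: the remaining lines have to be consumed by the pattern or dropped some other way.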

In your second post you use a regex of ^([0-9])* and replace with $1. My expectation was that this would match the first number, and then replace that with the first number. But your result was quite surprising in that it seemed to reinsert only the first digit. I tried reproducing this in Shortcuts myself, but I got the result I’d expected, not the one you got. So either there’s a non-obvious difference in that first number we’ve both used, or perhaps there’s a bug in your version of Shortcuts. You’re not running a public or developer beta by any chance, are you?

Why split the content? Newlines are just characters. You can account for them in the regex and save the need to have all of the extra steps. You need to do the regex anyway, so why not just do it in one step?

The solution with (.|\n)* works perfectly, of course.
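The exact pattern isn't reproduced in this thread, so the following is only a sketch of what a one-step replacement using `(.|\n)*` might look like. The pattern `^([0-9]+)(.|\n)*` and the sample text are assumptions for illustration, and this is Python's `re`, not ICU (where the replacement would be written `$1` rather than `\1`).

```python
import re

# Hypothetical sample input; only the leading 23252 comes from the thread.
text = "23252,alpha,beta\n118,gamma,delta\n7,epsilon,zeta"

# Group 1 captures the leading number; (.|\n)* then swallows everything
# after it, including the newlines that a plain . would stop at.
result = re.sub(r"^([0-9]+)(.|\n)*", r"\1", text)

print(result)
```

Because the whole text is matched and replaced by group 1, no separate split step is needed.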

My approach is probably due to a habit of dealing with interfaces where it’s simpler to read data one line at a time. If I were trying to do this on the Mac I’d do

head -1 file.csv | awk -F',' '{print $1}'

even though there’s a way to get awk itself to only look at the first record.

I’ve looked at every regex and there is no bug. Everything works as it should. Let me explain:

@sylumer used ^([0-9])* in a Match action, which returns the whole match and therefore the whole number. Note that the asterisk is outside of the group. If there were a Get Group of Matched Text action afterwards, it would return the last capture of the group, which in this case would be 2.

@ihf used ^([0-9])* in a Replace action and replaced the match with the first group. Since the asterisk is outside of the group, the group only captures the last occurrence, and the whole match is replaced with it, in this case with 2 (the last digit of the number).

@sylumer then uses a Replace action, but with the regex ^([0-9]*). Notice that the asterisk is now inside the group. Therefore the group captures the whole number and the Replace action inserts the whole number.

To add to the confusion, @sylumer tries the regex with the asterisk outside of the group (see quote above), but ends up with the regex ^(0-9])*. Notice the missing [ in front of the 0, which renders the whole regex as invalid and the Replace action silently does nothing.

I hope that it’s now clear where the different results come from.
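For anyone who wants to reproduce the inside/outside difference away from Shortcuts, here is a minimal Python sketch. The sample line is invented apart from the leading 23252, and Python's `re` is standing in for ICU here; both keep only the last iteration of a repeated group, but treat that equivalence as an assumption rather than a guarantee for more complex patterns.

```python
import re

# Hypothetical sample line; only the leading 23252 comes from the thread.
text = "23252,alpha,beta"

# Asterisk OUTSIDE the group: the group is repeated once per digit,
# so it ends up holding only the last digit it matched.
outside = re.match(r"^([0-9])*", text)
print(outside.group(0))  # the whole match
print(outside.group(1))  # the last captured digit only

# Asterisk INSIDE the group: the group captures the whole run of digits.
inside = re.match(r"^([0-9]*)", text)
print(inside.group(1))

# The same difference shows up when replacing the match with group 1.
replaced_outside = re.sub(r"^([0-9])*", r"\1", text)
replaced_inside = re.sub(r"^([0-9]*)", r"\1", text)
print(replaced_outside)
print(replaced_inside)
```

With the asterisk outside, the number collapses to its final digit; with it inside, the text comes back unchanged because the number is replaced with itself.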

And to the initial problem from @ihf: I can only repeat what @sylumer already said. If you simply want the very first number and nothing else, go with the Match action. Generally speaking, if you simply want to extract some information out of some text, it is best to use the Match action. On the other hand, if you want to modify a portion of the text, then the Replace action is the best choice.


:smile: I was obviously having one of those code blindness moments!

Be careful with all the asterisks :slight_smile: Also, since you just want one, I wouldn’t use replace.

How about this?