
Create array of regex matches

In Java, I am trying to return all regex matches to an array but it seems that you can only check whether the pattern matches something or not (boolean).

How can I use a regex match to form an array of all the strings that match a regular expression in a given string?

Good question. The information you seek should be part of the Java docs on Regex and Matcher. Sadly, it isn't.
A real shame. This functionality seems to exist out of the box in nearly every other language (that has regular expression support).

4castle

(4castle's answer is better than the below if you can assume Java >= 9)

You need to create a matcher and use that to iteratively find matches.

 import java.util.ArrayList;
 import java.util.List;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 ...

 List<String> allMatches = new ArrayList<String>();
 Matcher m = Pattern.compile("your regular expression here")
     .matcher(yourStringHere);
 while (m.find()) {
   allMatches.add(m.group());
 }

After this, allMatches contains the matches, and you can use allMatches.toArray(new String[0]) to get an array if you really need one.
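As a self-contained sketch of the loop above, the following wraps it in a helper and converts the list to an array. The class name RegexMatches and the method findAll are hypothetical names chosen for illustration; they are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMatches {
    // Hypothetical helper: returns every non-overlapping match of regex in input.
    public static String[] findAll(String regex, CharSequence input) {
        List<String> allMatches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            allMatches.add(m.group()); // group() with no argument is the whole match
        }
        // A zero-length array tells toArray which element type to allocate.
        return allMatches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(findAll("\\d+", "aa11bb22"))); // [11, 22]
    }
}
```

If nothing matches, findAll returns an empty array rather than null, which keeps callers free of null checks.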

You can also use MatchResult to write helper functions to loop over matches since Matcher.toMatchResult() returns a snapshot of the current group state.

For example you can write a lazy iterator to let you do

for (MatchResult match : allMatches(pattern, input)) {
  // Use match, and maybe break without doing the work to find all possible matches.
}

by doing something like this:

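// Needs java.util.Iterator, java.util.NoSuchElementException,
// java.util.regex.MatchResult, java.util.regex.Matcher and java.util.regex.Pattern.
public static Iterable<MatchResult> allMatches(
      final Pattern p, final CharSequence input) {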
  return new Iterable<MatchResult>() {
    public Iterator<MatchResult> iterator() {
      return new Iterator<MatchResult>() {
        // Use a matcher internally.
        final Matcher matcher = p.matcher(input);
        // Keep a match around that supports any interleaving of hasNext/next calls.
        MatchResult pending;

        public boolean hasNext() {
          // Lazily fill pending, and avoid calling find() multiple times if the
          // clients call hasNext() repeatedly before sampling via next().
          if (pending == null && matcher.find()) {
            pending = matcher.toMatchResult();
          }
          return pending != null;
        }

        public MatchResult next() {
          // Fill pending if necessary (as when clients call next() without
          // checking hasNext()), throw if not possible.
          if (!hasNext()) { throw new NoSuchElementException(); }
          // Consume pending so next call to hasNext() does a find().
          MatchResult next = pending;
          pending = null;
          return next;
        }

        /** Required to satisfy the interface, but unsupported. */
        public void remove() { throw new UnsupportedOperationException(); }
      };
    }
  };
}

With this,

for (MatchResult match : allMatches(Pattern.compile("[abc]"), "abracadabra")) {
  System.out.println(match.group() + " at " + match.start());
}

yields

a at 0
b at 1
a at 3
c at 4
a at 5
a at 7
b at 8
a at 10


I wouldn't suggest using an ArrayList here, since you don't know the size up front and might want to avoid the buffer resizing. Instead, I would prefer a LinkedList, though it's just a suggestion and doesn't make your answer any less valid.
@Liv, take the time to benchmark both ArrayList and LinkedList; the results may be surprising.
I hear what you're saying, and I am aware of the execution speed and memory footprint in both cases; the problem with ArrayList is that the default constructor creates a capacity of 10. If you go past that size with calls to add(), you have to bear the memory allocation and array copy, and that might happen a few times. Granted, if you expect just a few matches then your approach is the more efficient one; if, however, you find that the array "resizing" happens more than once, I would suggest a LinkedList, even more so if you're dealing with a low-latency app.
@Liv, if your pattern tends to produce matches with a fairly predictable size, and depending on whether the pattern matches sparsely or densely (based on the sum of the lengths of allMatches vs yourStringHere.length()), you can probably precompute a good size for allMatches. In my experience, the cost of LinkedList, both in memory and in iteration efficiency, is not usually worth it, so LinkedList is not my default posture. But when optimizing a hot spot, it is definitely worth swapping list implementations to see if you get an improvement.
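The presizing idea from the comment above could be sketched roughly like this. The class name, the collect method, and the assumedSpan parameter (a guess at how many input characters each match consumes, including the gap before the next one) are all assumptions made for illustration; whether presizing actually helps depends on the workload and is worth measuring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PresizedMatches {
    // assumedSpan is a hypothetical estimate of input characters per match.
    public static List<String> collect(Pattern p, CharSequence input, int assumedSpan) {
        int estimate = Math.max(10, input.length() / Math.max(1, assumedSpan));
        // Initial capacity only; the list still grows if the estimate is too low.
        List<String> allMatches = new ArrayList<>(estimate);
        Matcher m = p.matcher(input);
        while (m.find()) {
            allMatches.add(m.group());
        }
        return allMatches;
    }

    public static void main(String[] args) {
        List<String> words = collect(Pattern.compile("\\w+"), "one two three four", 5);
        System.out.println(words); // [one, two, three, four]
    }
}
```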
In Java 9, you can now use Matcher#results to get a Stream which you can use to generate an array (see my answer).
4castle

In Java 9, you can now use Matcher#results() to get a Stream<MatchResult> which you can use to get a list/array of matches.

import java.util.regex.MatchResult;
import java.util.regex.Pattern;

String[] matches = Pattern.compile("your regex here")
                          .matcher("string to search from here")
                          .results()
                          .map(MatchResult::group)
                          .toArray(String[]::new);
                    // or .collect(Collectors.toList()), which needs java.util.stream.Collectors
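For a version that can be run as-is, here is a small sketch of both variants (array and list) with concrete input. It assumes Java 9 or later; the class and method names are made up for the example.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ResultsDemo {
    // Array variant: results() yields a Stream<MatchResult> (Java 9+).
    public static String[] matchesAsArray(String regex, String input) {
        return Pattern.compile(regex).matcher(input)
                .results()
                .map(MatchResult::group)
                .toArray(String[]::new);
    }

    // List variant of the same pipeline.
    public static List<String> matchesAsList(String regex, String input) {
        return Pattern.compile(regex).matcher(input)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(matchesAsList("\\d+", "aa11bb22")); // [11, 22]
    }
}
```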

There is no results() method; please run this first.
@Bravo Are you using Java 9? It does exist. I linked to the documentation.
:(( Is there any alternative for Java 8?
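On Java 8, one possible stand-in is to build the stream by hand from Matcher.find() and Matcher.toMatchResult(). The sketch below is only an illustration of that idea: the class Java8Results and its results method are hypothetical helpers, not JDK API, and the plain while (m.find()) loop remains the simpler option.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class Java8Results {
    // Hypothetical Java 8 substitute for Matcher#results().
    public static Stream<MatchResult> results(Pattern p, CharSequence input) {
        final Matcher matcher = p.matcher(input);
        Spliterator<MatchResult> spliterator =
            new Spliterators.AbstractSpliterator<MatchResult>(
                    Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
                @Override
                public boolean tryAdvance(Consumer<? super MatchResult> action) {
                    if (!matcher.find()) {
                        return false; // no further matches
                    }
                    action.accept(matcher.toMatchResult()); // snapshot of this match
                    return true;
                }
            };
        return StreamSupport.stream(spliterator, false); // sequential stream
    }

    public static void main(String[] args) {
        String[] matches = results(Pattern.compile("\\d+"), "aa11bb22")
                .map(MatchResult::group)
                .toArray(String[]::new);
        System.out.println(String.join(",", matches)); // 11,22
    }
}
```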
zb226

Java makes regex too complicated and does not follow the Perl style. Take a look at MentaRegex to see how you can accomplish that in a single line of Java code:

String[] matches = match("aa11bb22", "/(\\d+)/g" ); // => ["11", "22"]

Is the MentaRegex site down? When I visit mentaregex.soliveirajr.com it only says "hi"
@user64141 looks like it is
@user64141 It's down now, but it's available on the Internet Archive: web.archive.org/web/20130317004214/http://…
Replaced the link with one to the more common mvnrepository.com...
walkeros

Here's a simple example:

Pattern pattern = Pattern.compile(regexPattern);
List<String> list = new ArrayList<String>();
Matcher m = pattern.matcher(input);
while (m.find()) {
    list.add(m.group());
}

(If you have several capturing groups, you can refer to each one by passing its index to the group method. If you need an array, use list.toArray(new String[0]); the no-argument list.toArray() returns an Object[], not a String[].)
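To illustrate referring to capturing groups by index, here is a minimal sketch. The key=value pattern, the class name, and the method name are assumptions picked for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Collects "key=value" pairs, reading the two capturing groups by index.
    public static String[] pairs(String input) {
        Pattern pattern = Pattern.compile("(\\w+)=(\\d+)");
        List<String> list = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            // group(0) is the whole match; group(1) and group(2) are the captures.
            list.add(m.group(1) + " -> " + m.group(2));
        }
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(pairs("a=1, b=22"))); // [a -> 1, b -> 22]
    }
}
```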


pattern.matches(input) does not work. You have to pass your regex pattern (again!) --> WTF Java?! The signature is Pattern.matches(String regex, String input). Did you mean pattern.matcher(input)?
@ElMac Pattern.matches() is a static method, you shouldn't call it on a Pattern instance. Pattern.matches(regex, input) is simply a shorthand for Pattern.compile(regex).matcher(input).matches().
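The difference matters for this question because matches() tests the whole input, while find() looks for the pattern anywhere in it. A short sketch (the class and method names are invented for the example):

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only if the entire input matches the regex.
    public static boolean wholeInput(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the regex matches some substring of the input.
    public static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInput("\\d+", "aa11bb22")); // false
        System.out.println(wholeInput("\\d+", "1122"));     // true
        System.out.println(anywhere("\\d+", "aa11bb22"));   // true
    }
}
```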
Anthony Accioly

From the regex trail of the official Java Tutorials:

        // console is a java.io.Console, e.g. obtained from System.console()
        Pattern pattern =
            Pattern.compile(console.readLine("%nEnter your regex: "));

        Matcher matcher =
            pattern.matcher(console.readLine("Enter input string to search: "));

        boolean found = false;
        while (matcher.find()) {
            console.format("I found the text \"%s\" starting at " +
                "index %d and ending at index %d.%n",
                matcher.group(), matcher.start(), matcher.end());
            found = true;
        }

Use find() and add each resulting group to your array, List, or whatever collection you need.


Nikhil Kumar K
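        // Collect the text inside every #{...} placeholder; a Set drops duplicates.
        Set<String> keyList = new HashSet<>();
        Pattern regex = Pattern.compile("#\\{(.*?)\\}");
        Matcher matcher = regex.matcher("Content goes here");
        while (matcher.find()) {
            keyList.add(matcher.group(1)); // group(1) is the text between the braces
        }
        return keyList;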
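
Wrapped in a method and given some sample input, the fragment above might look like the following sketch. The class name, the keys method, and the sample string are assumptions added for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderKeys {
    // Lazily matches the text between "#{" and the next "}".
    private static final Pattern PLACEHOLDER = Pattern.compile("#\\{(.*?)\\}");

    public static Set<String> keys(String content) {
        Set<String> keyList = new HashSet<>();
        Matcher matcher = PLACEHOLDER.matcher(content);
        while (matcher.find()) {
            keyList.add(matcher.group(1)); // repeated keys collapse into one entry
        }
        return keyList;
    }

    public static void main(String[] args) {
        Set<String> found = keys("Hi #{name}, order #{id} for #{name}");
        // HashSet has no defined order, so sort before printing.
        System.out.println(new TreeSet<>(found)); // [id, name]
    }
}
```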