Number of occurrences of a character in a string in Java

How do I count the number of occurrences of a char in a String?

I want to count the occurrences of ‘.’ in an idiomatic way, preferably a one-liner. (Previously I had expressed this constraint as "without a loop", in case you’re wondering why everyone’s trying to answer without using a loop).

Loops were made for a problem like this. Write the loop in a common utility class, then call your freshly minted one-liner.

Just to point out: I appreciate the one-liners. They’re fun and (as a real advantage) often easy to remember. But a separate method with a loop is better in just about every way, including readability and even performance. Most of the "elegant" solutions below will not perform very well because they involve rebuilding strings and copying memory, whereas a loop that just scans the string and counts occurrences is fast and simple. Not that performance should generally be a factor, but don’t look at a one-liner and assume it will outperform a loop.


How about this. It doesn’t use regexp underneath so should be faster than some of the other solutions and won’t use a loop.

int count = line.length() - line.replace(".", "").length(); 

This is the best answer. The reason it is the best is because you don’t have to import another library.


Ugly code can be minimized by making it a method in your own "StringUtils" class. Then the ugly code is in exactly one spot, and everywhere else is nicely readable.

The loop method is much faster than this, especially when counting a char instead of a String (String.replace(char, char) can only substitute one char for another, not remove it). On a 15-character string, I measured 6,049 ns vs 26,739 ns (averaged over 100 runs). That is a large gap in raw numbers, and percentage-wise it adds up. Avoid the memory allocations and use a loop!

My ‘idiomatic one-liner’ for this is:

int count = StringUtils.countMatches("a.b.c.d", "."); 

Why write it yourself when it’s already in commons lang?

Spring Framework’s oneliner for this is:

int occurrence = StringUtils.countOccurrencesOf("a.b.c.d", ".");

Guava equivalent: int count = CharMatcher.is('.').countIn("a.b.c.d"); . As answered by dogbane in a duplicate question.

What’s been expensive, at every company at which I’ve worked, is having lots of poorly-written and poorly-maintained "*Utils" classes. Part of your job is to know what’s available in Apache Commons.

A summary of the other answers, plus every way I know to do this with a one-liner:

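String testString = "a.b.c.d";

1) Using Apache Commons

int apache = StringUtils.countMatches(testString, ".");
System.out.println("apache = " + apache);

2) Using Spring Framework

int spring = org.springframework.util.StringUtils.countOccurrencesOf(testString, ".");
System.out.println("spring = " + spring);

3) Using replace

int replace = testString.length() - testString.replace(".", "").length();
System.out.println("replace = " + replace);

4) Using replaceAll (case 1)

int replaceAll = testString.replaceAll("[^.]", "").length();
System.out.println("replaceAll = " + replaceAll);

5) Using replaceAll (case 2)

int replaceAllCase2 = testString.length() - testString.replaceAll("\\.", "").length();
System.out.println("replaceAll (second case) = " + replaceAllCase2);

6) Using split

int split = testString.split("\\.", -1).length - 1;
System.out.println("split = " + split);

7) Using Java 8 (case 1)

long java8 = testString.chars().filter(ch -> ch == '.').count();
System.out.println("java8 = " + java8);

8) Using Java 8 (case 2)

long java8Case2 = testString.codePoints().filter(ch -> ch == '.').count();
System.out.println("java8 (second case) = " + java8Case2);

9) Using StringTokenizer

int stringTokenizer = new StringTokenizer(" " + testString + " ", ".").countTokens() - 1;
System.out.println("stringTokenizer = " + stringTokenizer);

More info on GitHub: https://github.com/Vedenin/useful-java-links/blob/master/helloworlds/5.0-other-examples/src/main/java/other_examples/FindCountOfOccurrencesBenchmark.java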

Performance test (using JMH, mode = AverageTime; a score of 0.010 is better than 0.351):

Benchmark               Mode  Cnt  Score    Error  Units
1. countMatches         avgt    5  0.010 ± 0.001  us/op
2. countOccurrencesOf   avgt    5  0.010 ± 0.001  us/op
3. stringTokenizer      avgt    5  0.028 ± 0.002  us/op
4. java8_1              avgt    5  0.077 ± 0.005  us/op
5. java8_2              avgt    5  0.078 ± 0.003  us/op
6. split                avgt    5  0.137 ± 0.009  us/op
7. replaceAll_2         avgt    5  0.302 ± 0.047  us/op
8. replace              avgt    5  0.303 ± 0.034  us/op
9. replaceAll_1         avgt    5  0.351 ± 0.045  us/op

The printed strings do not match the ones above, and the order is fastest first, which makes lookup tricky at least. Nice answer otherwise!

case 2, generalized for codepoints that need more than one UTF-16 code unit: "1🚲2🚲3 has 2".codePoints().filter((c) -> c == "🚲".codePointAt(0)).count()

Sooner or later, something has to loop. It's far simpler for you to write the (very simple) loop than to use something like split which is much more powerful than you need.

By all means encapsulate the loop in a separate method, e.g.

public static int countOccurrences(String haystack, char needle) {
    int count = 0;
    for (int i = 0; i < haystack.length(); i++) {
        if (haystack.charAt(i) == needle) {
            count++;
        }
    }
    return count;
}

Then you don't need to have the loop in your main code - but the loop has to be there somewhere.

(I'm not even sure where the "stack" bit of the comment comes from. It's not like this answer is my recursive one, which is indeed nasty to the stack.)

Not only that, but this is possibly an anti-optimization unless you look at what the JIT does. If you did the above on an array for loop, for example, you might make things worse.

@sulai: Chris's concern is baseless, IMO, in the face of a trivial JIT optimization. Is there any reason that comment drew your attention at the moment, over three years later? Just interested.

Probably @sulai just came across the question as I did (while wondering if Java had a built-in method for this) and didn't notice the dates. However, I'm curious how moving the length() call outside of the loop could make performance worse, as mentioned by @ShuggyCoUk a few comments up.

I had an idea similar to Mladen, but the opposite.

String s = "a.b.c.d";
int charCount = s.replaceAll("[^.]", "").length();
System.out.println(charCount);

Correct. replaceAll(".") would replace any character, not just dot. replaceAll("\\.") would have worked. Your solution is more straightforward.

jjnguy had actually suggested a replaceAll("[^.]") first, upon seeing my "a.b.c.d".split("\\.").length-1 solution. But after being hit 5 times, I deleted my answer (and his comment).

"...now you have two problems" (oblig.) Anyway, I'd bet that there are tens of loops executing in replaceAll() and length(). Well, if it's not visible, it doesn't exist ;o)

I don't think it's a good idea to use a regex and create a new string just for counting. I would just create a static method that loops over every character in the string to count the matches.

@mingfai: indeed, but the original question is about making a one-liner, and even, without a loop (you can do a loop in one line, but it will be ugly!). Question the question, not the answer. 🙂

String s = "a.b.c.d"; int charCount = s.length() - s.replaceAll("\\.", "").length(); 

replaceAll(".") would replace all characters.

PhiLho's solution uses replaceAll("[^.]", ""), where the dot does not need to be escaped, since inside a character class [.] represents the character 'dot', not 'any character'.
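To make the escaping rules from these comments concrete, here is a small sketch comparing the three patterns on the sample string. The class and method names are made up for illustration.

```java
final class DotPatternDemo {
    // An unescaped "." matches every character, so everything is removed.
    static int removedByUnescapedDot(String s) {
        return s.length() - s.replaceAll(".", "").length();
    }

    // "\\." matches only a literal dot.
    static int removedByEscapedDot(String s) {
        return s.length() - s.replaceAll("\\.", "").length();
    }

    // Inside a character class the dot is literal, so "[^.]" removes everything except dots.
    static int keptByNegatedClass(String s) {
        return s.replaceAll("[^.]", "").length();
    }

    public static void main(String[] args) {
        String s = "a.b.c.d";
        System.out.println(removedByUnescapedDot(s)); // 7, the whole string
        System.out.println(removedByEscapedDot(s));   // 3
        System.out.println(keptByNegatedClass(s));    // 3
    }
}
```

Only the last two give the number of dots; the first one just returns the length of the string.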

String s = "a.b.c.d"; long result = s.chars().filter(ch -> ch == '.').count(); 

My 'idiomatic one-liner' solution:

int count = "a.b.c.d".length() - "a.b.c.d".replace(".", "").length(); 

Have no idea why a solution that uses StringUtils is accepted.

This creates an extra string just to produce a count. No idea why anyone would prefer this over StringUtils if StringUtils is an option. If it's not an option, they should just create a simple for loop in a utility class.

String text = "a.b.c.d"; int count = text.split("\\.",-1).length-1; 

This one seems to have relatively large overhead; be warned that it may create a lot of small strings. Normally that does not matter much, but use with care.

Here is a solution without a loop:

public static int countOccurrences(String haystack, char needle, int i) {
    return ((i = haystack.indexOf(needle, i)) == -1) ? 0 : 1 + countOccurrences(haystack, needle, i + 1);
}

System.out.println("num of dots is " + countOccurrences("a.b.c.d", '.', 0));

Well, there is a loop, but it is invisible 🙂

The problem sounds contrived enough to be homework, and if so, this recursion is probably the answer you're being asked to find.

That uses indexOf, which will loop... but a nice idea. Posting a truly "just recursive" solution in a minute.

If it has more occurrences than your available stack slots, you will have a stack overflow exception 😉

I don't like the idea of allocating a new string for this purpose. And as the string already has a char array in the back where it stores its value, String.charAt() is practically free.

does the trick, without additional allocations that need collection, in 1 line or less, with only J2SE.
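The one-liner itself is not shown above. A minimal sketch of a single-pass charAt count that fits the description might look like the following; the names `s`, `delim`, and `num` are assumptions, not necessarily the ones the answer used.

```java
final class SinglePassCount {
    static int count(String s, char delim) {
        int num = 0;
        // One pass over the string: no regex and no intermediate strings.
        for (int i = 0; i < s.length(); i++) if (s.charAt(i) == delim) num++;
        return num;
    }

    public static void main(String[] args) {
        System.out.println(count("a.b.c.d", '.')); // 3
    }
}
```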

Giving some love for this one because it is the only one doing a single pass over the string. I DO care about performance.

charAt iterates through 16-bit UTF-16 code units, not characters! A char in Java is not a character. So this answer assumes that no code unit of a surrogate pair is equal to delim. I am not sure whether that holds for the dot, but in general it might not.
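A small sketch of the distinction raised here, with made-up class and method names. Surrogate code units occupy the range U+D800 to U+DFFF, so comparing char by char against a Basic Multilingual Plane character such as '.' cannot accidentally match half of a surrogate pair. A character outside the BMP does not fit in a single char, so counting it needs a code-point comparison instead.

```java
final class CodePointCount {
    // Counts a BMP character by comparing UTF-16 code units.
    static long countChar(String s, char needle) {
        return s.chars().filter(c -> c == needle).count();
    }

    // Counts any Unicode code point, including supplementary ones.
    static long countCodePoint(String s, int needle) {
        return s.codePoints().filter(cp -> cp == needle).count();
    }

    public static void main(String[] args) {
        int bicycle = 0x1F6B2; // a supplementary code point, stored as two chars
        String bike = new String(Character.toChars(bicycle));
        String s = "1" + bike + "2" + bike + "3.";
        System.out.println(countChar(s, '.'));          // 1
        System.out.println(countCodePoint(s, bicycle)); // 2
    }
}
```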

Okay, inspired by Yonatan's solution, here's one which is purely recursive - the only library methods used are length() and charAt() , neither of which do any looping:

public static int countOccurrences(String haystack, char needle) {
    return countOccurrences(haystack, needle, 0);
}

private static int countOccurrences(String haystack, char needle, int index) {
    if (index >= haystack.length()) {
        return 0;
    }
    int contribution = haystack.charAt(index) == needle ? 1 : 0;
    return contribution + countOccurrences(haystack, needle, index + 1);
}

Whether recursion counts as looping depends on which exact definition you use, but it's probably as close as you'll get.

I don't know whether most JVMs do tail-recursion these days. If not, you'll get the eponymous stack overflow for suitably long strings, of course.

No, tail recursion will probably be in Java 7, but it's not widespread yet. This simple, direct tail recursion could be translated to a loop at compile time, but the Java 7 stuff is actually built-in to the JVM to handle chaining through different methods.

You'd be more likely to get tail recursion if your method returned a call to itself (including a running total parameter), rather than returning the result of performing an addition.
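A sketch of the accumulator form described in this comment, with hypothetical names. The recursive call is the last thing the method does, which is the shape a tail-call optimizer needs. As far as I know, HotSpot does not eliminate tail calls, so this version can still overflow the stack on very long strings.

```java
final class TailRecursiveCount {
    static int countOccurrences(String haystack, char needle) {
        return countOccurrences(haystack, needle, 0, 0);
    }

    // 'total' carries the running count, so no work remains after the recursive call returns.
    private static int countOccurrences(String haystack, char needle, int index, int total) {
        if (index >= haystack.length()) {
            return total;
        }
        int next = haystack.charAt(index) == needle ? total + 1 : total;
        return countOccurrences(haystack, needle, index + 1, next);
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("a.b.c.d", '.')); // 3
    }
}
```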

Not sure about the efficiency of this, but it's the shortest code I could write without bringing in 3rd party libs:

public static int numberOf(String target, String content) {
    return (content.split(target).length - 1);
}

To also count occurrences at the end of the string, you have to call split with a negative limit argument, like this: return (content.split(target, -1).length - 1); . By default, occurrences at the end of the string are omitted from the array returned by split(). See the docs.
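A hedged sketch of the split approach with the negative limit applied. Because split takes a regular expression, this version also wraps the target in Pattern.quote so that a target such as "." is matched literally; that quoting is my addition, not part of the answer above, and the class name is made up.

```java
import java.util.regex.Pattern;

final class SplitCount {
    // Pattern.quote makes the target literal; the limit of -1 keeps trailing
    // empty strings, so occurrences at the end of the content are counted.
    static int numberOf(String target, String content) {
        return content.split(Pattern.quote(target), -1).length - 1;
    }

    public static void main(String[] args) {
        System.out.println(numberOf(".", "a.b.c.d"));           // 3
        System.out.println(numberOf(".", "a.b.."));             // 3
        System.out.println("a.b..".split("\\.").length - 1);    // 1, trailing matches dropped
    }
}
```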

Inspired by Jon Skeet, a non-loop version that won't blow your stack. Also a useful starting point if you want to use the fork-join framework.

public static int countOccurrences(CharSequence haystack, char needle) {
    return countOccurrences(haystack, needle, 0, haystack.length());
}

// Alternatively String.substring/subsequence used to be relatively efficient
// on most Java library implementations, but isn't any more [2013].
private static int countOccurrences(
    CharSequence haystack, char needle, int start, int end
) {
    if (start == end) {
        return 0;
    } else if (start + 1 == end) {
        return haystack.charAt(start) == needle ? 1 : 0;
    } else {
        int mid = (end + start) >>> 1; // Watch for integer overflow.
        return
            countOccurrences(haystack, needle, start, mid) +
            countOccurrences(haystack, needle, mid, end);
    }
}

(Disclaimer: Not tested, not compiled, not sensible.)
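For the fork-join idea, here is a rough sketch of how the same divide-and-conquer split could be handed to a ForkJoinPool. The class name and the THRESHOLD value are assumptions chosen for illustration; whether this beats a plain loop would have to be measured.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class ForkJoinCount extends RecursiveTask<Integer> {
    private static final int THRESHOLD = 1 << 13; // assumed cut-off; tune by measurement

    private final CharSequence haystack;
    private final char needle;
    private final int start;
    private final int end;

    ForkJoinCount(CharSequence haystack, char needle, int start, int end) {
        this.haystack = haystack;
        this.needle = needle;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Integer compute() {
        // Small ranges are counted directly with a plain loop.
        if (end - start <= THRESHOLD) {
            int count = 0;
            for (int i = start; i < end; i++) {
                if (haystack.charAt(i) == needle) {
                    count++;
                }
            }
            return count;
        }
        // Larger ranges are split in half; one half runs asynchronously.
        int mid = (start + end) >>> 1;
        ForkJoinCount left = new ForkJoinCount(haystack, needle, start, mid);
        ForkJoinCount right = new ForkJoinCount(haystack, needle, mid, end);
        left.fork();
        return right.compute() + left.join();
    }

    static int countOccurrences(CharSequence haystack, char needle) {
        return ForkJoinPool.commonPool()
                .invoke(new ForkJoinCount(haystack, needle, 0, haystack.length()));
    }
}
```

The recursion depth here is logarithmic in the string length, which is what keeps the stack small.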

Perhaps the best (single-threaded, no surrogate-pair support) way to write it:

public static int countOccurrences(String haystack, char needle) {
    int count = 0;
    for (char c : haystack.toCharArray()) {
        if (c == needle) {
            ++count;
        }
    }
    return count;
}


Counting occurrences of a given character in a string in Java

This post discusses how to count the occurrences of a given character in a string in Java.

1. Naive solution

We can write our own routine for this simple task. The idea is to iterate over the characters of the string with a for loop and, for each character encountered, increment a counter (starting from 0) if it matches the given character.

2. Using Java 8

In Java 8, we can use a Stream to count the occurrences of a given character in a string.

3. Using the Guava library

Another good alternative is to use Guava's CharMatcher class.

4. Using Apache Commons Lang

We can also do this with the countMatches method of the StringUtils class provided by the Apache Commons Lang library.

return org.apache.commons.lang3.StringUtils.countMatches(str, String.valueOf(ch));

5. Using the replace() method

Here is another solution: use the String replace() method to remove all occurrences of the given character from the string, then use the length() method to work out the count from the difference in lengths.

6. Using regular expressions

Another possible approach is to use regular expressions together with a counter.
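A minimal sketch of what "a regular expression with a counter" could look like, assuming the character should be matched literally. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexCount {
    static int count(String str, char ch) {
        // Pattern.quote makes the character literal, so '.' is not treated as a wildcard.
        Matcher matcher = Pattern.compile(Pattern.quote(String.valueOf(ch))).matcher(str);
        int count = 0;
        // Each successful find() advances past one occurrence.
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("a.b.c.d", '.')); // 3
    }
}
```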

7. Using a frequency map

The time complexity of all the solutions above is at least linear, since we scan the whole string. If the total number of lookups is large, consider preprocessing the string once and building a frequency map from it that stores the count of each distinct character present in the string. Each subsequent lookup then takes only constant time.
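A sketch of the frequency-map idea using a HashMap keyed by char. The class name CharFrequency is hypothetical, and the map counts UTF-16 code units, so it shares the surrogate-pair caveat of the charAt-based solutions.

```java
import java.util.HashMap;
import java.util.Map;

final class CharFrequency {
    private final Map<Character, Integer> counts = new HashMap<>();

    // One linear pass builds the map; this is the preprocessing step.
    CharFrequency(String str) {
        for (int i = 0; i < str.length(); i++) {
            counts.merge(str.charAt(i), 1, Integer::sum);
        }
    }

    // Each lookup is a single hash-map access.
    int count(char ch) {
        return counts.getOrDefault(ch, 0);
    }

    public static void main(String[] args) {
        CharFrequency freq = new CharFrequency("a.b.c.d");
        System.out.println(freq.count('.')); // 3
        System.out.println(freq.count('z')); // 0
    }
}
```

This only pays off when the same string is queried many times; for a single lookup the plain loop does less work.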
