Kotlin Collection VS Kotlin Sequence VS Java Stream
Although the functional API of Kotlin collections resembles the Java 8 Stream API, Kotlin's collections are not the same as Java's collections.
Kotlin's collections are divided into mutable collections and read-only (often loosely called immutable) collections. The read-only types are List, Set, and Map; they expose no operations for modifying the collection. The mutable types are MutableList, MutableSet, and MutableMap, which support both reading and writing.
The functional APIs of Kotlin collections are similar to those found in most languages that support lambdas. The following uses only the filter, map, and flatMap functions as examples to demonstrate higher-order functions on collections.
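The split between the two families can be sketched as follows. This is a minimal illustration that uses only the standard library; the function name `appendAndView` and the variable names are made up for the example.

```kotlin
// Returns a read-only view of a list after appending one element.
// Function and variable names are illustrative, not from the original text.
fun appendAndView(base: List<Int>, extra: Int): List<Int> {
    val mutable: MutableList<Int> = base.toMutableList() // independent mutable copy
    mutable.add(extra)                                   // allowed: MutableList has add()
    val readOnly: List<Int> = mutable                    // same object, seen through the read-only interface
    // readOnly.add(1)                                   // would not compile: List has no add()
    return readOnly
}

fun main() {
    val original = listOf(5, 12, 8)
    val extended = appendAndView(original, 33)
    println(original) // [5, 12, 8]
    println(extended) // [5, 12, 8, 33]
}
```

Note that `List` only restricts what the holder of the reference can do; it does not promise that the underlying object can never change.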
1.1 Use of Filter
Keep the numbers greater than 10 in the list and print them.
listOf(5, 12, 8, 33) // create a list
    .filter { it > 10 }
    .forEach(::println)
::println is a function reference (the Kotlin counterpart of a Java method reference), which is a shorthand for a lambda expression that only calls that function.
The above code is equivalent to the following code:
listOf(5, 12, 8, 33)
    .filter { it > 10 }
    .forEach { println(it) }
1.2 Map Usage
Convert all the strings in the collection to uppercase and print them out.
listOf("java", "kotlin", "scala", "groovy")
    .map { it.uppercase() } // toUpperCase() is deprecated since Kotlin 1.5
    .forEach(::println)
Output:
JAVA
KOTLIN
SCALA
GROOVY
1.3 Use of flatMap
Iterate through all the elements, create a collection for each one, and finally flatten all of those collections into a single collection.
val newList = listOf(5, 12, 8, 33)
    .flatMap { listOf(it, it + 1) }
println(newList)
Output:
[5, 6, 12, 13, 8, 9, 33, 34]
2. Sequence
Sequence is another container type provided by the Kotlin library. Sequences and collections have the same function API, but are implemented differently.
In fact, Kotlin's Sequence is more like a Java 8 Stream, in that both are evaluated lazily. A Kotlin collection can be converted to a Sequence simply by calling the asSequence() method.
listOf(5, 12, 8, 33)
    .asSequence()
    .filter { it > 10 }
    .forEach(::println)
Or use sequenceOf() directly to create a new Sequence:
sequenceOf(5, 12, 8, 33) // create the sequence
    .filter { it > 10 }
    .forEach(::println)
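To make the lazy behaviour visible, the sketch below records the order in which the stages actually run. The `log` list and the function name `traceSequence` are helpers introduced only for this illustration.

```kotlin
// Records the order in which the pipeline stages run.
// `log` and `traceSequence` are illustrative names, not part of any API.
fun traceSequence(): List<String> {
    val log = mutableListOf<String>()
    val pipeline = sequenceOf(5, 12, 8, 33)
        .filter { log.add("filter $it"); it > 10 }
        .map { log.add("map $it"); it * 2 }
    log.add("pipeline built")      // nothing has been filtered or mapped yet
    val result = pipeline.toList() // the terminal operation triggers the work
    log.add("result $result")
    return log
}

fun main() {
    traceSequence().forEach(::println)
}
```

The log starts with "pipeline built", and each element then travels through filter and map before the next element is touched, which is the element-by-element processing that distinguishes a Sequence from a List pipeline.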
As stated in the Kotlin 1.2.70 release notes:
Using Sequence helps avoid unnecessary temporary allocation overhead and can significantly improve the performance of complex processing pipelines.
Write an example to verify this statement:
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations.*
import org.openjdk.jmh.runner.Runner
import org.openjdk.jmh.runner.options.OptionsBuilder

@BenchmarkMode(Mode.Throughput) // benchmark mode: overall throughput
@Warmup(iterations = 3) // warm-up iterations
@Measurement(iterations = 10, time = 5, timeUnit = TimeUnit.SECONDS) // iterations = 10 means 10 measured rounds
@Threads(8) // number of test threads per process
@Fork(2) // JMH forks two processes to run the test
@OutputTimeUnit(TimeUnit.MILLISECONDS) // time unit of the benchmark results
open class SequenceBenchmark {

    @Benchmark
    fun testSequence(): Int {
        return sequenceOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
            .map { it * 2 }
            .filter { it % 3 == 0 }
            .map { it + 1 }
            .sum()
    }

    @Benchmark
    fun testList(): Int {
        return listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
            .map { it * 2 }
            .filter { it % 3 == 0 }
            .map { it + 1 }
            .sum()
    }
}

fun main() {
    val options = OptionsBuilder()
        .include(SequenceBenchmark::class.java.simpleName)
        .output("benchmark_sequence.log")
        .build()
    Runner(options).run()
}
The benchmark results are as follows:
# Run complete. Total time: 00:05:23

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell.

Benchmark                        Mode  Cnt      Score     Error   Units
SequenceBenchmark.testList      thrpt   20  15924.272 ± 305.825  ops/ms
SequenceBenchmark.testSequence  thrpt   20  23099.938 ± 515.524  ops/ms
The example above was tested with JMH, a benchmarking tool provided by the OpenJDK that can benchmark at the method level. In this run the Sequence version reached about 23,100 ops/ms against about 15,900 ops/ms for the List version, so Sequence was more efficient than List for this chain of calls.
This is because every step of a collection pipeline returns a new intermediate collection, whereas a Sequence does not create a collection at each processing step. For large amounts of data, prefer Sequence.
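The difference in how much work each step does can be observed without a benchmark. The sketch below counts how often the mapping lambda runs; it deliberately swaps the terminal operation to `first`, which is not what the benchmark uses, because a short-circuiting operation makes the effect easy to see. The function names are illustrative.

```kotlin
// Counts how many times the mapping lambda runs in an eager List pipeline.
fun mapCallsWithList(): Int {
    var calls = 0
    (1..10).toList()
        .map { calls++; it * 2 }   // builds a full intermediate list of 10 elements
        .first { it % 3 == 0 }     // only then searches that list
    return calls
}

// Same pipeline on a Sequence: elements are pulled one at a time.
fun mapCallsWithSequence(): Int {
    var calls = 0
    (1..10).asSequence()
        .map { calls++; it * 2 }   // no intermediate collection is created
        .first { it % 3 == 0 }     // stops as soon as a match is found
    return calls
}

fun main() {
    println(mapCallsWithList())     // 10
    println(mapCallsWithSequence()) // 3
}
```

The List version maps all ten elements before searching, while the Sequence version stops after the third element, whose doubled value 6 is the first multiple of 3.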
Sequence VS Stream
Sequence and Stream both use lazy evaluation.
In programming language theory, lazy evaluation, also known as call-by-need, is a strategy that minimizes the amount of work a computer has to do. It has two related but distinct meanings, which can be expressed as "delayed evaluation" and "minimal evaluation". In addition to the performance gains, the most important benefit of lazy evaluation is that it makes it possible to construct infinite data structures.
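As a small illustration of an infinite data structure, the following sketch takes a finite prefix of a sequence that would otherwise never end. It uses the standard library function `generateSequence`; the name `firstPowersOfTwo` is made up for the example.

```kotlin
// Returns the first `n` powers of two from a conceptually infinite sequence.
fun firstPowersOfTwo(n: Int): List<Int> =
    generateSequence(1) { it * 2 } // 1, 2, 4, 8, ... never ends on its own
        .take(n)                   // lazily limits the sequence to n elements
        .toList()                  // terminal operation: only n values are computed

fun main() {
    println(firstPowersOfTwo(5)) // [1, 2, 4, 8, 16]
}
```

The same pipeline written with an eager List would have no way to represent the unbounded source in the first place.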
Here are some differences between Sequence and Stream:
Feature comparison | Sequence | Stream |
---|---|---|
Autoboxing | Automatic boxing occurs | Automatic boxing can be avoided for primitive types |
Parallelism | Not supported | Supported |
Cross-platform | Supports Kotlin/JVM, Kotlin/JS, Kotlin/Native and other platforms | Only available on Kotlin/JVM, and requires JVM version >= 8 |
Ease of use | More concise and supports more functions | Terminal operations with Collectors make the Stream more verbose |
Performance | Most terminal operators are inline functions; values that may not exist are returned as nullable types | Values that may not exist are wrapped in an Optional, which adds one more step of object creation |
In terms of ease of use and performance, I prefer Sequence when given a choice between Sequence and Stream.
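The autoboxing row of the table can be sketched like this, assuming a Kotlin/JVM target since `java.util.stream.IntStream` is only available there. Both function names are illustrative.

```kotlin
import java.util.stream.IntStream

// Sequence<Int> is generic, so on the JVM its elements travel as boxed Integer objects.
fun sumOfSquaresWithSequence(n: Int): Int =
    (1..n).asSequence().map { it * it }.sum()

// IntStream is a primitive specialization, so the elements stay plain ints.
fun sumOfSquaresWithIntStream(n: Int): Int =
    IntStream.rangeClosed(1, n).map { it * it }.sum()

fun main() {
    println(sumOfSquaresWithSequence(10))  // 385
    println(sumOfSquaresWithIntStream(10)) // 385
}
```

Both versions compute the same value; whether the boxing matters in practice depends on the workload and would have to be measured.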
Kotlin Sequences vs Java Streams
The sample code above demonstrates several Java stream operations that run sequentially. However, now that multi-core processors are common, Java changed the game by introducing parallel streams.
Java Parallel Streams
A huge advantage of Java streams over Kotlin sequences is that streams can be run in parallel. Sequences do not run in parallel; after all, they are sequential by definition!
Running your operation chains in parallel comes with a startup cost, but once that cost is overcome, their performance is hard to beat in many cases.
Parallel streams enable us to execute code in parallel on separate cores. The final result is the combination of each individual outcome. So if you're on the JVM and you're processing lots of data with an operation chain that's amenable to running in parallel, it's worth considering Java's parallel streams.
The example below demonstrates this. To run a Java stream in parallel, you just need to call parallel() on the stream, or use parallelStream() on a Collection, before performing the list of operations.
IntStream rangeOfInts = IntStream.rangeClosed(1, 10);
rangeOfInts.parallel().forEach(System.out::println); // output order is not guaranteed

// finalList is assumed to be an existing List<Object> without null elements
List<String> returnList = finalList.parallelStream()
        .map(Object::toString)     // flatMap would need a Stream-returning function, so map is used
        .filter(s -> !s.isEmpty())
        .sorted()
        .map(String::toUpperCase)
        .collect(Collectors.toList());
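Because Java streams can be called from Kotlin on the JVM, the same switch between sequential and parallel execution can be sketched in Kotlin. The function names are illustrative, and the sketch only checks that both variants agree; it makes no claim about which one is faster.

```kotlin
// Sequential stream: elements are processed on the calling thread.
fun sumDoubledSequential(values: List<Int>): Long =
    values.stream().mapToLong { it * 2L }.sum()

// Parallel stream: the work may be split across several worker threads.
fun sumDoubledParallel(values: List<Int>): Long =
    values.parallelStream().mapToLong { it * 2L }.sum()

fun main() {
    val values = (1..1_000).toList()
    println(sumDoubledSequential(values)) // 1001000
    println(sumDoubledParallel(values))   // 1001000
}
```

Summation is associative, so splitting the work does not change the result; operations that depend on encounter order or shared mutable state need more care.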
Performance Implications
Parallelism can bring performance benefits in certain use cases. But parallel streams cannot be considered as a magical performance booster. So, sequential streams should still be used as default during development.
In conclusion, although sequences can perform much better than collections, there are still plenty of cases where collections have the edge.
Java Streams vs. Kotlin Sequences
Since Java streams are available to be used within Kotlin code, a common question among developers that use Kotlin for backend development is whether to use streams or sequences. I also included some surprises that affect the way we work with sequences and structure our code.
This article analyzes both options from 3 perspectives to determine their strengths and weaknesses:
Null Safety
Using Java streams within Kotlin code results in platform types when using non-primitive values. For example, the following evaluates to List<Person!> instead of List<Person>, so it becomes less strongly typed:
people.stream()
    .filter { it.age > 18 }
    .toList() // evaluates to List<Person!>
When dealing with values that might not be present, sequences return nullable types whereas streams wrap the result in an Optional:
// Sequence
val nameOfAdultWithLongName = people.asSequence()
    ...
    .find { it.name.length > 5 }
    ?.name

// Stream
val nameOfAdultWithLongName = people.stream()
    ...
    .filter { it.name.length > 5 }
    .findAny()
    .get() // unsafe unwrapping of Optional
    .name
Although we could use orElse(null) in the example above, the compiler doesn’t force us to use Optional wrappers correctly. Even if we did use orElse(null) , the resulting value would be a platform type so the compiler doesn’t enforce safe usage. This pattern has bitten us several times as a runtime exception will be thrown with the stream version if no person is found. Sequences, however, use Kotlin nullable types so safe usage is enforced at compile time.
Therefore sequences are safer from a null-safety perspective.
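A self-contained sketch of the two approaches follows. The `Person` class, the sample data, and the `age >= 18` filter are assumptions made for illustration, since the original snippet elides those steps. The stream variant uses `orElse(null)` rather than `get()` so that it does not throw on an empty result.

```kotlin
// Hypothetical model class for this sketch.
data class Person(val name: String, val age: Int)

// Sequence: find returns Person?, so the compiler forces the safe call ?.name.
fun longNameViaSequence(people: List<Person>): String? =
    people.asSequence()
        .filter { it.age >= 18 }
        .find { it.name.length > 5 }
        ?.name

// Stream: the result is wrapped in an Optional and must be unwrapped by hand.
fun longNameViaStream(people: List<Person>): String? =
    people.stream()
        .filter { it.age >= 18 }
        .filter { it.name.length > 5 }
        .findFirst()
        .map { it.name }
        .orElse(null) // yields a platform type; declaring String? here is our own choice

fun main() {
    val people = listOf(Person("Ann", 30), Person("Alexander", 40))
    println(longNameViaSequence(people))      // Alexander
    println(longNameViaStream(people))        // Alexander
    println(longNameViaSequence(emptyList())) // null
}
```

In the stream version nothing stops a caller from declaring the return type as `String`, which is the gap in compile-time enforcement described above.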
Readability & Simplicity
Using collectors for terminal operations makes streams more verbose:
// Sequence
val adultsByGender = people.asSequence()
    .filter { it.age >= 18…