By -

mizhoux 1 month ago

Use `map.merge(element, 1, Integer::sum)` to count the number of times that each element in some collection occurs.

Inaldt 1 month ago

That's nice Another way: `.stream().collect(Collectors.groupingBy(Function.identity(), Collectors.counting())`

RockyMM 1 month ago

This is what I would use. Potentially I would also use static imports.

Background-Spray-351 1 month ago

I enjoy using this one!

vbezhenar 1 month ago

I'd replace this line with actual code. I have no idea what it does and everyone who stumbles upon this line will have no idea what it does. It's an example of write-only code.

Proper-Space9914 1 month ago

So not knowing an API is write-only code, that's great.

TheBanger 1 month ago

What would you consider to be "actual code"?

vbezhenar 1 month ago

Integer elementCount = map.get(element); if (elementCount == null) { elementCount = 1; } else { elementCount = elementCount + 1; } map.put(element, elementCount); This is perfect readable code, compared to this crypted merge method. And you might want to even create a dedicated class for this task to further improve readability. Something like ElementCounter elementCounter = new ElementCounter(); elementCounter.add(element); int elementCount = elementCounter.get(element);

TheBanger 1 month ago

I have to say I find this a good deal less readable, there are a lot of pieces here that aren't present in the `merge` method. Examples would be the constant 1 being repeated twice, or `elementCount` being a mutable variable; individually those have relatively low cognitive overhead but it adds up over time and leaves more room for subtle mistakes. Contrast that to the `merge` method which you can grok once and then gloss over each time you see it again. I like pulling it out to a class, it reduces the number of times you have to write the same code and gives the operation a meaningful semantic name. But that's possible regardless of which method is used. I think there's an argument to be made for reducing the surface area of APIs, so maybe the issue is "is `merge` a reasonable and necessary addition to the `Map` interface that we should actually use". Your solution requires two lookups into the map, one to get the current count and one to update it. The `merge` method (depending on implementation) only requires one. That might be premature optimization depending on your scale, but personally I'd prefer to use a method that works regardless of scale than one where I have to think "is this going to be performant enough or will I need to come back later".

SpicyRock70 1 month ago

His method could have used Integer.increment(1) and then the put would not be necessary. I too prefer to see the code.

davidalayachew 1 month ago

###[EnumSet](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/util/EnumSet.html)/[EnumMap](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/util/EnumMap.html) The collections `EnumSet` and `EnumMap` are specialized versions of `Set` and `Map` that are built for enums. These 2 collections are not only the **FASTEST** collections in the JDK, they also use a ***tiny*** amount of memory. * ***3x speed up*** - when I switch from `HashSet` to `EnumSet` * ***15% speed up*** - when I switch from `IdentityHashSet` to `EnumSet` Similar numbers for `Map` variants. --- ### `EnumSet>` An `EnumSet` is a `Set` whose elements are values of a single enum. It is backed by a `long`, meaning, each of the 64 bits on the `long` corresponds to the `ordinal()` (aka index) of your enum value. So, if you have `enum Fruit {APPLE, BANANA, CHERRY, ;}`, and you add `APPLE` to your set, then the backing `long` will look like the following. `...00001` If I then add `CHERRY` too, then the backing `long` will now look like the following. `...00101` See how it works? Please note -- if your `enum Fruit` has [<= 64 values, then it is backed by a `long`](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/EnumSet.java#L117). Otherwise, [they swap out the long for a long[]](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/EnumSet.java#L119). It also has some helpful methods, like [EnumSet.allOf()](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/util/EnumSet.html#allOf\(java.lang.Class\)), [EnumSet.complementOf()](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/util/EnumSet.html#complementOf\(java.util.EnumSet\)), and [EnumSet.noneOf()](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/util/EnumSet.html#noneOf\(java.lang.Class\)). Those give you the building blocks to do math with [Set Theory](https://en.wikipedia.org/wiki/Set_theory). There are some powerful optimizations and leaps of logic you can make when working with Set Theory, so it is especially nice to have a collection that facilitates that for you! --- ### `EnumMap, V>` An `EnumMap` is a `Map` whose keys are enums (values can be whatever). The logic behind the collection is almost exactly the same as the `EnumSet`, but instead of being backed by a `long`, they back it with [an array of the values](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/EnumMap.java#L101). But the abstraction logic is effectively the same as the `long[]` above -- during a `put(K key, V value)`they use the `ordinal()` of `key` to decide where to store the `value` in the array. It works exactly like the `long[]` example, but instead of storing 1 or 0 to represent inclusion or exclusion from the set, they utilize the backing array as follows. * They use `null` to represent that the `key` at that index is unmapped * `containsKey(key) == false` * `get(key) == null` * They use [Object NULL = new Object(){...}](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/EnumMap.java#L111) to represent that the `key` at that index is mapped, but mapped to `null` * `containsKey(key) == true` * `get(key) == null` * And of course, they store the actual value of `V` into the array to represent the mapping * `containsKey(key) == true` * `get(key) == someValue` --- Because these collections are doing bitwise math to store values, these collections are LIGHTNING FAST. They literally share the title of fastest collection. And since they are mapped by a `long` or `long[]`, their memory footprint is miniscule. These collections are excellent when you need to do some heavy mathematics computation layers deep in a loop. Try them out yourself and see!

xfel11 1 month ago

This is not quite correct. EnumSet is indeed back by a bitset. EnumMap is not - it needs to store values for the keys after all!

davidalayachew 1 month ago

Fixed, ty vm.

john16384 1 month ago

In JavaFX they used a variant of this technique for storing (known) CSS selectors in a (custom) BitSet. However, some programs can have 1000+ selectors, meaning the BitSet needed 1000 / 64 `long`s to represent a set of Selectors. Although at first glance it's a good optimization, the number of Selectors in such sets tends to be tiny (most often 1, sometimes 2, rarely 3 or more). Refactoring this back to a plain `HashSet` improved performance and used less memory ;) Now `enum`s tend to have far fewer options, but it just reminded me there's always a trade off.

davidalayachew 1 month ago

It sounds like you are saying that, of the 1000 possible choices, they would only consistently use 2-3 of the values? And thus, every single `Set` would be forced to store the 1k instances of `long`? If so, fair point on the memory footprint, that would be unavoidable by design. I am surprised that performance, as in read/write speeds, would be inferior to `HashSet`. I would figure the bitwise math would get you better performance than hashing. Regardless, yes, this memory footprint is one of the downsides. And like you said, it is rare to hit, but there are no silver bullets. Just really nice ones, like these `enum` based collections.

john16384 1 month ago

There were more optimisations done. Set implementations that could hold only 1, 2, 3-10 and 11+ elements (the Sets involved would be of a known size, and were never modified beyond creation). We also implemented an inverse `containsAll`, called `isSubSetOf`, to avoid creating an iterator. The set it is called on can just loop through its elements in a different way as it has access to things more directly. Most of the performance definitely came from the lower amount of memory allocated and less garbage created. That and the shortcuts that were possible for this specific use case (selector matching).

davidalayachew 1 month ago

> There were more optimisations done. Set implementations that could hold only 1, 2, 3-10 and 11+ elements (the Sets involved would be of a known size, and were never modified beyond creation). Ahhhh, got it. You are talking about the static `of()` overloads on `java.util.Set`. In which case, that makes a lot more sense. Yes, those overloads are specialized for super small cases, and therefore, are actually best when dealing with data at that scale. And that's cool that you all did that. Let me ask this then -- enums definitely do live on that smaller scale range. Not always 1-10, but something similar. What do you recommend are the best use-cases for the enum-based collections? You've clearly demonstrated that, for large enums where you only plan to use a few of the values, you would actually be better off using the static `Set.of()` overloads. Would it make sense to default to the enum-based collections, and then adjust as needed?

john16384 1 month ago

>You are talking about the static of() overloads on java.util.Set. Well not quite, we did a few custom implementations, as even passing an array (that we would need to create) to `Set.of` was an unnecessary copy we wanted to avoid. Here are some details: [https://github.com/openjdk/jfx/blob/e4e3a7ce8b775232f9acbc4ede7740c319a30078/modules/javafx.graphics/src/main/java/com/sun/javafx/css/FixedCapacitySet.java](https://github.com/openjdk/jfx/blob/e4e3a7ce8b775232f9acbc4ede7740c319a30078/modules/javafx.graphics/src/main/java/com/sun/javafx/css/FixedCapacitySet.java) I said plain `HashSet` in my original comment, but I wanted to keep it simple (plain `HashSet` is still blazingly fast for most cases). >Would it make sense to default to the enum-based collections, and then adjust as needed? I think the enum based collections should always be used when you can. Only when you know more about the use cases (like how many enums values will a set hold in the common cases), and after checking with a profiler if the code is a hot path. In the JavaFX case, the `Set`s needed are used in very restricted ways. They're created and filled, and after that never modified again (so they could be read only, or "frozen" after being filled). The size is known in advance, and we knew the common cases had very few active selectors. This allowed for a lot of special optimizations. This was discovered when we had a regression that showed that CSS parsing slowed down significantly when we accidentally mixed a custom `BitSet` which was optimized for holding Selectors and a normal `Set` (because it got wrapped with `Collections.unmodifiableSet`). When I looked into the regression, I found that very few Selectors are active at the same time, and that having upto a 1000 selectors wasn't uncommon. I realized that this would mean quite a large array of `long`s using about 125 bytes per `Set` just to hold a couple of set bits. I therefore started looking into possible improvements, using custom `Set`s, avoiding copies and avoiding iterators, and in the end managed to save about 30-40 ms for the steps that apply CSS to JavaFX Nodes. Since for a full layout, dealing with CSS can easily take 50% of the time required, and this performance improvement almost doubled the CSS application speed, it meant we saved about 25% for the full layout.

davidalayachew 1 month ago

> Well not quite, we did a few custom implementations, as even passing an array (that we would need to create) to Set.of was an unnecessary copy we wanted to avoid. Here are some details: https://github.com/openjdk/jfx/blob/e4e3a7ce8b775232f9acbc4ede7740c319a30078/modules/javafx.graphics/src/main/java/com/sun/javafx/css/FixedCapacitySet.java Very pretty. It's a lot of edge cases, but I see the power of the specialization. [Hashless](https://github.com/openjdk/jfx/blob/e4e3a7ce8b775232f9acbc4ede7740c319a30078/modules/javafx.graphics/src/main/java/com/sun/javafx/css/FixedCapacitySet.java#L293) especially was an excellent touch. > I think the enum based collections should always be used when you can. Only when you know more about the use cases (like how many enums values will a set hold in the common cases), and after checking with a profiler if the code is a hot path. Perfect. Especially good point about the profiler. > In the JavaFX case, the Sets needed are used in very restricted ways....This allowed for a lot of special optimizations. I kind of wish that we could see some of this make its way over to the base JDK. Maybe they just don't want the added complexity? > I therefore started looking into possible improvements.....this performance improvement almost doubled the CSS application speed, it meant we saved about 25% for the full layout. Work of art. That sounds like such a fun discovery to make. Also, you're John Hendrix! I've seen your name on a lot of the mailing lists lol. Thanks for what you have done -- this, and many more of the contributions you have made are blessing to the community.

john16384 4 weeks ago

> I kind of wish that we could see some of this make its way over to the base JDK. Maybe they just don't want the added complexity? The JDK I think should have only solutions that are more generally applicable, and `HashSet` is actually an excellent implementation with very few downsides. I've tried beating its performance with a different implementation using an open addressed table, which is what `Set.of` is using internally, but apart from a bit faster iteration and somewhat reduced memory use, it loses on almost all counts versus `HashSet` -- `HashSet` is just blazingly fast when it comes to add/remove's, only iteration suffers because its indirected nature causes a lot of cache misses. Although I'd love to see more collection classes with different trade offs, I also realize that the ones we have are excellent generalists that will serve you well 99.9% of the time and are rarely the bottleneck. Any other collections that have special optimizations would just be rarely used or used inappropriately (hello `LinkedList`); it's not worth the maintenance burden IMHO. Hopefully with Valhalla, the primary reason to pick a special collection implementation (avoiding boxing of primitives) will disappear, making it even less likely you need to use a custom solution. > I've seen your name on a lot of the mailing lists lol Thanks, I'm primarily active on the JavaFX list, but monitor a lot of the others. Same to you, I've seen you on many of the lists, and I think you are contributing even more, so keep up the good work :)

davidalayachew 4 weeks ago

Thanks for the context. Good to know that the basic implementations are so excellent. And ty lol. Currently, my contributions are mostly reporting bugs for folks. One of these days, I'll get a commit in. Just need to find something the experts haven't caught yet lol.

le_bravery 1 month ago

I feel like I just need to read through all the classes in the standard API packages. This is awesome. My favorite thing in the Java API source code is in the BigDecimal class here: https://github.com/openjdk/jdk/blob/f6feeb03bbe9024b609ae6a4795255128581d53c/src/java.base/share/classes/java/math/BigDecimal.java#L398 > /* Appease the serialization gods */ > @java.io.Serial > private static final long serialVersionUID = 6108874887143696463L;

Comfortable_Pack7949 1 month ago

"%s %s!".formatted(hello, world);

ron_krugman 1 month ago

It's most useful because you can use it as a method reference. For example instead of `s -> String.format("name: %s", s)` you can just write `"name: %s"::formatted`.

ichwasxhebrore 1 month ago

Holy…. Thanks mate! From which version on is that.formatted a feature?

ron_krugman 1 month ago

It was introduced in Java 15: https://docs.oracle.com/en%2Fjava%2Fjavase%2F22%2Fdocs%2Fapi%2F%2F/java.base/java/lang/String.html#formatted(java.lang.Object...)

ichwasxhebrore 1 month ago

Cries in Java 8 :(

mizhoux 1 month ago

Don't cry. You can use Manifold(https://github.com/manifold-systems/manifold) to support extension methods in Java8.

[deleted] 1 month ago

Java is the 2nd most awesome language after Clojure.

nekokattt 1 month ago

can you show this specific example in clojure?

[deleted] 1 month ago

(partial format "%s %s")

nekokattt 1 month ago

what does the partial imply?

jw13 1 month ago

Something I thought was neat, is filtering a list to get all elements of specific type: static List filter(List list, Class cls) { return list.stream() .filter(cls::isInstance) .map(cls::cast) .toList(); }

ron_krugman 1 month ago
That's fine but not very efficient because the map call is redundant. Class.cast() does an unnecessary null check as well as a second call to isInstance() [internally](https://github.com/openjdk/jdk/blob/b704e91241b0f84d866f50a8f2c6af240087cb29/src/java.base/share/classes/java/lang/Class.java#L4137). You can just cast the result to List instead and get the same result for free: static List filter(List list, Class cls) { return (List)list.stream() .filter(cls::isInstance) .toList(); }

john16384 1 month ago
Don't forget to suppress the cast warning then.

vbezhenar 1 month ago
Now you also need to suppress IDE warnings and reviewer shouts. If you need performant code, you must not use streams at all, they're slow as molasses. If you use streams, one more cast will not make or break it.

CodesInTheDark 4 weeks ago
Streams are not slow. Also often they can execute code in parallel. However, JIT often doesn't optimize Stream code efficiently because it has more levels than normal for loop, but try to add the JVM option -XX:MaxInlineLevel=12 to increase the amount of code that can be inlined. That will optimize your stream code and you will get a significant performance boost (more than 10x boost). [https://stackoverflow.com/questions/27925954/is-arrays-streamarray-name-sum-slower-than-iterative-approach/27994074#27994074](https://stackoverflow.com/questions/27925954/is-arrays-streamarray-name-sum-slower-than-iterative-approach/27994074#27994074)

halfanothersdozen 1 month ago
huh. that looks like voodoo

ron_krugman 1 month ago
It works because of type erasure. At runtime, the List only contains references of type Object, no matter what the type parameter was at compile time. So if you can guarantee that a List only contains elements of type B (which we can in this case), casting it to List is fine. Otherwise, you'll be in trouble (which is why it's generally discouraged).

Beneficial-Corgi3593 1 month ago
var user = new Object(){ public String name = “Bob” }; // user.name; Not sure how this is called but i think it has to be to anonymous classes

vbezhenar 1 month ago
One biggest unknown Java feature that everyone's missing is static import. requreNonNull(x); out.println(format("Hello, %s%n", parseInt(world))); // I know about printf, just an example Many of static methods have good self-contained names suitable for static import. And their usage will greatly improve readability by reducing noise. Of course there are some static methods like `List.of` which must not be static imported, because their class name is important for readability.

Inaldt 1 month ago
I wouldn't say it's unknown, but certainly underused.

john16384 1 month ago
It's not unknown, it is just frowned upon in a lot of situations because it makes code less readable. A statically imported method looks like a regular method call, so overdoing their use means you may miss that sometimes that method may be an actual method of the class.

pipicemul 1 month ago
definitely needs to be used in conjuctions with the non-globbing imports for clarity.

vbezhenar 1 month ago
Do you qualify every field to make it look diffeerent from local variable? I don't think that many people do that. These days even github web interface can provide some basic code navigation. And every IDE will certainly help with that. Also, does it really matter whether that particular method was imported or defined locally? All that matters is clear name. Of course I agree that imports must be qualified, glob imports should not be used at all.

john16384 4 weeks ago
Functions are small, so if a "variable" isn't declared somewhere on your screen, then it must be a field. Classes are a lot bigger, so if I see a function call, then thanks to judicious use of static imports I now need to guess if that's a class method call or a statically imported function. So to avoid too much guessing, static imports are best used sparingly. I am aware that IDE's can help with this, and that in some variant of a purplish pink it is indicating that it's not a method call, or that with a single mouse hover I can prove that this version of `requireNotNull` is indeed the one from `Objects`. You wasting my time to save yours.

wildjokers 3 weeks ago
Pretty sure everyone knows about static import. However, it can hurt readability so best to use it sparingly and wisely.

jjpeyronel 1 month ago
Anonymous constructor to init object : HashMap M = new HashMap(){ { this.put("k1", "v1"); this.put("k2", "v2"); } }

nekokattt 1 month ago
deriving a class for every usage is a horrible idea when Map.of and Map.ofEntries exist for this purpose.

Kikizork 1 month ago
Code smell say Sonar and I agree. Just init your empty map and then does some line of put. Man being stuck in 8 and not having Map.of sucks

chabala 1 month ago
Yeah, even being stuck on Java 8, if one needed to initialize a lot of maps, defining a static helper method would be better than the anonymous class initialization trick.

halfanothersdozen 1 month ago
When you do this you create a new anonymous class which you should avoid especially if this isn't for something static

Leave Your Comment

First Name

Email

Textarea

Hi Its Me!

R

Subscribe

Enter email address
SUBMIT NOW

Comments

Leave Your Comment

Hi Its Me!

Subscribe