Java long string constants
I said I was proficient with strings, so the interviewer asked me whether String has a length limit in Java.
String length limit
To answer this, we first need to look at the source code of String. The String class has many overloaded constructors, several of which let the caller pass in a length, for example:
public String(byte bytes[], int offset, int length)
The length parameter is declared as an int. In other words, the longest String that can be constructed this way is bounded by the maximum value of the int type.
According to the Integer class, java.lang.Integer#MAX_VALUE is 2^31-1.
So, can we think that the maximum length that String can support is this value?
In fact, it is not. This value is only the maximum length supported when a String is constructed at runtime. There is a separate, much smaller limit when a string is defined at compile time.
The following code:
String s = "11111...1111"; // one hundred thousand "1" characters
When a string is defined this way, javac fails with a compile error (not a runtime exception), and the message is:
Error: The constant string is too long
The String constructor clearly accepts a length of up to 2147483647 (2^31-1), so why does the form above fail to compile?
In a declaration such as String s = "xxx", the xxx part is called a literal. After compilation, this literal is stored as a constant in the class file's constant pool.
And that is the catch: to enter the constant pool, the literal has to obey the constant pool's rules.
Constant pool limit
Generally speaking, javac is a command to compile java files into class files, so certain specifications need to be followed during the process of class file generation.
According to the definition of the constant pool in section 4.4 of the Java Virtual Machine Specification, CONSTANT_String_info represents a constant object of type java.lang.String, and its format is as follows:
CONSTANT_String_info {
    u1 tag;
    u2 string_index;
}
Here, the value of string_index must be a valid index into the constant pool, and the constant pool entry at that index must be a CONSTANT_Utf8_info structure. That structure holds the sequence of Unicode code points with which the String object will eventually be initialized; in other words, CONSTANT_Utf8_info carries the actual value of the string constant.
CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}
Here, length gives the size of the bytes[] array, and its type is u2.
According to the specification, u2 is an unsigned two-byte quantity. One byte is 8 bits, so two bytes are 16 bits.
The maximum value of a 16-bit unsigned number is 2^16-1 = 65535.
In other words, the class file format dictates that the encoded form of a string constant cannot exceed 65535 bytes.
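The limit above is counted in bytes of the class file's modified UTF-8 encoding, not in characters. As a rough sketch (the class and method names here are made up for illustration and are not part of javac or the JDK), the encoded size of a string can be estimated like this:

```java
// Illustrative helper, not a JDK or javac API.
final class ModifiedUtf8 {
    /** Largest value a u2 length field can hold. */
    static final int MAX_U2 = 0xFFFF; // 65535

    /** Number of bytes s would occupy in a CONSTANT_Utf8_info bytes[] array. */
    static int encodedLength(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                bytes += 1; // ASCII except NUL
            } else if (c <= 0x07FF) {
                bytes += 2; // NUL and U+0080..U+07FF
            } else {
                bytes += 3; // everything else, including each surrogate half
            }
        }
        return bytes;
    }

    /** True if the encoded form fits in a u2 length field. */
    static boolean fitsInConstantPool(String s) {
        return encodedLength(s) <= MAX_U2;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("abc"));    // 3
        System.out.println(encodedLength("\u4e2d")); // 3: one CJK character, three bytes
    }
}
```

A string of 30,000 CJK characters is well under 65535 characters but needs 90,000 bytes, so it would not fit even though its character count looks safe.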
We can define the string in the following way:
String s = "11111...1111"; // 65535 "1" characters
Compile this with javac and you will still get the "constant string is too long" error. Why?
Looking further into the javac code, the Gen class contains the following:
private void checkStringConstant(DiagnosticPosition var1, Object var2) {
    if (this.nerrs == 0 && var2 != null && var2 instanceof String && ((String)var2).length() >= 65535) {
        this.log.error(var1, "limit.string", new Object[0]);
        ++this.nerrs;
    }
}
It can be seen from the code that when the parameter type is String and the length is greater than or equal to 65535, the compilation will fail.
You can also debug the javac compilation process at this point (the video shows one way to debug javac) and confirm that the error is reported here.
If we try to define a string with 65534 characters, we will find that it can compile normally.
Actually, this value is also explained in the Java Virtual Machine Specification:
if the Java Virtual Machine code for a method is exactly 65535 bytes long and ends with an instruction that is 1 byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes
Runtime restrictions
The String length limit described above is a compile-time limit. It applies only to strings defined as literals, in the form String s = "".
Is there also a limit at runtime? Yes: the Integer.MAX_VALUE mentioned earlier. In JDK versions before 9, where each character takes two bytes, a String of that length occupies roughly 4 GB. If the length of a String would exceed this range at runtime, an exception may be thrown.
int is a 32-bit type. Counting only its positive range, a String can hold up to 2^31-1 = 2147483647 16-bit Unicode characters:
2147483647 * 16 = 34359738352 (bits)
34359738352 / 8 = 4294967294 (bytes)
4294967294 / 1024 = 4194303.998046875 (KB)
4194303.998046875 / 1024 = 4095.9999980926513671875 (MB)
4095.9999980926513671875 / 1024 = 3.99999999813735485076904296875 (GB)
That is nearly 4 GB.
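The same arithmetic can be checked in code. This is only a sketch with a hypothetical helper name; it assumes two bytes per character, as in the calculation above, and ignores object headers and any other JVM overhead.

```java
// Illustrative helper; counts character payload only.
final class StringSizeEstimate {
    /** Bytes needed for the given number of 16-bit chars; long avoids int overflow. */
    static long utf16Bytes(int chars) {
        return (long) chars * Character.BYTES; // Character.BYTES is 2
    }

    public static void main(String[] args) {
        long bytes = utf16Bytes(Integer.MAX_VALUE);
        System.out.println(bytes); // 4294967294
        // Dividing by 1024^3 gives a value just under 4 (GB).
        System.out.println(bytes / (1024.0 * 1024 * 1024));
    }
}
```

The cast to long matters: multiplying two int values here would overflow before the result is widened.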
Many people find this confusing: if the length has to stay below 65535 at compile time, how can it be larger than 65535 at runtime? In practice this is very common, for example:
String s = "";
for (int i = 0; i < 100000; i++) {
    s += "i";
}
The resulting string has a length of 100,000. I have run into the runtime limit in a real application before.
In an earlier system integration, high-resolution images had to be transmitted. The agreed approach was for the other party to encode each image as Base64, and for us to convert it back into an image after receiving it.
An exception was thrown when the Base64-encoded content was assigned to a string.
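One way to sidestep holding a very large Base64 payload in a single String is to decode it as a stream. The sketch below is not what the system above did; the class and method names are hypothetical, and it relies only on java.util.Base64 from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;

// Illustrative helper, assuming the Base64 text arrives as an InputStream.
final class Base64Streaming {
    /** Decodes Base64 text from in and writes the raw bytes to out, chunk by chunk. */
    static void decode(InputStream in, OutputStream out) {
        try (InputStream decoded = Base64.getDecoder().wrap(in)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = decoded.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** In-memory demo: encode, then stream-decode back to the original bytes. */
    static byte[] roundTrip(byte[] original) {
        byte[] encoded = Base64.getEncoder().encode(original);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        decode(new ByteArrayInputStream(encoded), out);
        return out.toByteArray();
    }
}
```

In a real integration the input would be a socket or file stream and the output a file, so the full payload never has to exist as one String.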
Strings do have length limits. At compile time, a constant in the constant pool cannot exceed 65535, and javac caps the length at 65534 characters.
At runtime, the length cannot exceed the range of int; otherwise an exception is thrown.
Handling Large String Constants in Java
What is the best way to handle large string constants in Java? Imagine that I have a test fixture for SOAP and I want to send the following string:
BN325 B2B3 B08A AP3U86V flightline 0 CarParking lgw 21-Jun-2005 07:00 28-Jun-2005 07:00 1
I’d rather not put quotes and pluses around every line. If I put it in a file it’s extra code and it would be somewhat hard to put several strings in the same file. XML has problems escaping text (I have to use CDATA ugliness). Is there an easier way?
You can put it in a properties file. You still have to write code, but not much, and you can put more than one string in it.
[rant]I can’t believe Java still hasn’t fixed this. It’s one of the main reasons that Java sucks for so many applications. Why is it so hard for them to get this?[/rant]
If the strings are unrelated, you could put them in separate files even if it’s a lot of files (what is the problem with that?).
If you insist on one file, you could come up with a unique delimiter, but you would be paying a price when attempting to randomly access a specific entry.
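As a rough sketch of the single-file approach, the parser below splits text into named entries. The "=== name" marker line is an assumed convention invented for this example, as are the class and method names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative parser for a made-up fixture format.
final class DelimitedFixtures {
    /** Hypothetical marker: a line starting with this begins a new entry. */
    static final String MARKER = "=== ";

    /** Maps each entry name to its body; every body line keeps a trailing newline. */
    static Map<String, String> parse(String text) {
        Map<String, String> entries = new LinkedHashMap<>();
        String name = null;
        StringBuilder body = new StringBuilder();
        for (String line : text.split("\r?\n")) {
            if (line.startsWith(MARKER)) {
                if (name != null) {
                    entries.put(name, body.toString());
                }
                name = line.substring(MARKER.length()).trim();
                body.setLength(0);
            } else if (name != null) {
                body.append(line).append('\n');
            }
        }
        if (name != null) {
            entries.put(name, body.toString());
        }
        return entries;
    }
}
```

The whole file is parsed once up front, which is the price mentioned above: there is no cheap random access to a single entry without reading the file.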
Data files should almost always be externalized (likely in a separate directory) and read when needed, rather than hardcoded into the code. It’s cleaner, reduces code size, reduces the need for compilation, and allows you to use the same data file for multiple tests. Most test fixtures as well as build and integration tools support external files.
Or, you could write code or a builder that builds SOAP from arguments, making this all a lot more concise (if you’re willing to pay the runtime cost). (Correction: I see you changed your sample, this would be nasty to auto-generate).
Java "constant string too long" compile error. Only happens using Ant, not when using Eclipse
I have a few really long strings in one class for initializing user information. When I compile in Eclipse, I don’t get any errors or warnings, and the resulting .jar runs fine. Recently, I decided to create an ant build file to use. Whenever I compile the same class with ant, I get the "constant string too long" compile error. I’ve tried a number of ways to set the java compiler executable in ant to make sure that I’m using the exact same version as in Eclipse. I’d rather figure out how to get the same successful compile I get in Eclipse in Ant than try to rework the code to dynamically concatenate the strings.
your string is too long, as you may realize. as a hack you can split it into multiple strings in your source code and concatenate them. this is what the eclipse java compiler is doing on your behalf.
Someone is trying to send you a message 🙂 In the time you’ve spent fiddling with compiler versions you could have loaded the data from a text file, which is probably where it belongs.
I found I could use the apache commons lang StringUtils.join( Object[] ) method to solve this.
public static final String CONSTANT = org.apache.commons.lang.StringUtils.join( new String[] { "This string is long", "really long. ", "really, really LONG. " } );
I realise that this answer doesn’t really answer the original poster’s question; it is only a workaround. I wouldn’t recommend it for production code, only for quick and dirty testing purposes.
None of the above worked for me. I created a text file named test.txt and read it using the code below
String content = new String(Files.readAllBytes(Paths.get("test.txt")));
This is the best choice I guess, to store the content in a separate file instead of in java source code itself.
I had a bit trouble getting this code to work. See howtodoinjava.com/java/io/java-read-file-to-string-examples for the full code with .readAllBytes and the stuff you have to import — it helped me.
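For reference, here is a self-contained sketch of the file-reading approach with its imports spelled out. The class and method names are illustrative, and the explicit UTF-8 charset is an assumption about how the file was saved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative helper around Files.readAllBytes.
final class TextFiles {
    /** Reads the whole file as a UTF-8 string. */
    static String readUtf8(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo: writes text to a temporary file and reads it back. */
    static String writeThenRead(String text) {
        try {
            Path tmp = Files.createTempFile("fixture", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return readUtf8(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Usage would look like String content = TextFiles.readUtf8(java.nio.file.Paths.get("test.txt")). Passing the charset explicitly avoids depending on the platform default encoding.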
The length of a string constant in a class file is limited to 65535 (2^16 - 1) bytes in UTF-8 encoding, so this should not depend on the compiler used. Perhaps you are using a different character set in your ant file than in eclipse, so that some characters need more bytes than before. Please check the encoding attribute of your javac task.
A workaround is to chunk your string using new String() (yikes) or StringBuilder, e.g.
String CONSTANT = new String("first chunk") + new String("second chunk") + ... + new String("...");
String CONSTANT = new StringBuilder("first chunk") .append("second chunk") .append("...") .toString();
These can be viable options if you’re producing this string e.g. from a code generator. The workaround documented in this answer where string literals are concatenated no longer works with Java 11
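If a code generator is producing the chunks, the splitting step might look like the sketch below. The class name, method name, and default chunk size are assumptions made for illustration; 16,000 characters at no more than 3 bytes each stays under the 65535-byte limit.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper for a code generator; not part of any library.
final class LiteralChunker {
    /** 16,000 chars at up to 3 encoded bytes each is at most 48,000 bytes. */
    static final int DEFAULT_CHUNK_CHARS = 16_000;

    /** Splits s into pieces of at most maxChars chars without cutting a surrogate pair. */
    static List<String> chunk(String s, int maxChars) {
        if (maxChars < 2) {
            throw new IllegalArgumentException("maxChars must be at least 2");
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        while (start < s.length()) {
            int end = Math.min(start + maxChars, s.length());
            // Keep a high surrogate together with the low surrogate that follows it.
            if (end < s.length() && Character.isHighSurrogate(s.charAt(end - 1))) {
                end--;
            }
            parts.add(s.substring(start, end));
            start = end;
        }
        return parts;
    }
}
```

Each element of the returned list can then be emitted as its own literal and passed to one append call.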
String theString2 = IOUtils.toString(new FileInputStream(new File(rootDir + "/properties/filename.text")), "UTF-8");
Another trick, if I’m determined to put a long string in the source, is to avoid the compiler detecting it as a constant expression.
String dummyVar = "";
String longString = dummyVar
    + "This string is long\n"
    + "really long. \n"
    + "really, really LONG. ";
This worked for a while, but if you keep going too far the next problem is a stack overflow in the compiler. This describes the same problem and, if you’re still determined, how to increase your stack — the problem seems to be the sheer size of the method now. Again this wasn’t a problem in Eclipse.
I was able to resolve this issue in a way similar to Lukas Eder's.
String testXML = "REALLY LONG STRING. "; testXML += "SECOND PART OF REALLY LONG STRING. "; testXML += "THIRD PART OF REALLY LONG STRING. ";
Just split it up and add it together;
Did you try this? Never tried it myself, but here is the relevant section:
Using the ant javac adapter
The Eclipse compiler can be used inside an Ant script using the javac adapter. In order to use the Eclipse compiler, you simply need to define the build.compiler property in your script. Here is a small example.
/src"/> /../org.eclipse.jdt.core/bin"/>
I would really consider making your classes standards compatible. I believe the official limit is 65535, and the fact that Eclipse is more lenient is something that could change on you at the most inconvenient of times, and either way constantly having to get the project compiled with Eclipse can really start to limit you in too many ways.