
Treat F5-FF octets as single (invalid) characters

RFC 3629 restricts UTF-8 to U+0000..U+10FFFF, so octets F5-FF can
never begin a valid sequence. Counting each of them as one invalid
character yields the largest character count that any valid parser
could report. A buffer sized from that count may be oversized,
but never undersized.

This follows further discussion with acozzette in this PR:
https://github.com/protocolbuffers/protobuf/pull/6844

Signed-off-by: William A Rowe Jr <wrowe@pivotal.io>
Signed-off-by: Yechiel Kalmenson <ykalmenson@pivotal.io>
William A Rowe Jr 5 years ago
parent
commit
53a814a0ee
1 changed files with 1 additions and 1 deletions

src/google/protobuf/stubs/strutil.cc

@@ -2292,7 +2292,7 @@ static const unsigned char kUTF8LenTbl[256] = {
   1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,
   1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,
   2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2,
-  3,3,3,3,3,3,3,3, 3,3,3,3,3,3,3,3, 4,4,4,4,4,4,4,4, 5,5,5,5,6,6,1,1
+  3,3,3,3,3,3,3,3, 3,3,3,3,3,3,3,3, 4,4,4,4,4,1,1,1, 1,1,1,1,1,1,1,1
 };
 
 // Return length of a single UTF-8 source character
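
The effect of the table change on character counting can be sketched as follows. This is a minimal illustration, not the code in strutil.cc: `NewLeadOctetLength`, `OldLeadOctetLength`, and `CountCharacters` are hypothetical names, the branches only mirror the table rows visible in the hunk above, and the rows for octets 00-7F (not shown in the hunk) are assumed to be all 1.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length implied by a lead octet after this change.
// Mirrors the patched table: 80-BF -> 1, C0-DF -> 2, E0-EF -> 3,
// F0-F4 -> 4, F5-FF -> 1. Octets 00-7F are assumed to map to 1.
static std::size_t NewLeadOctetLength(unsigned char c) {
  if (c < 0xC0) return 1;  // ASCII (assumed) and stray continuation octets
  if (c < 0xE0) return 2;  // C0-DF
  if (c < 0xF0) return 3;  // E0-EF
  if (c < 0xF5) return 4;  // F0-F4
  return 1;                // F5-FF: one invalid character each
}

// Hypothetical helper: length implied by a lead octet before this change.
// Mirrors the removed row: F0-F7 -> 4, F8-FB -> 5, FC-FD -> 6, FE-FF -> 1.
static std::size_t OldLeadOctetLength(unsigned char c) {
  if (c < 0xC0) return 1;
  if (c < 0xE0) return 2;
  if (c < 0xF0) return 3;
  if (c < 0xF8) return 4;
  if (c < 0xFC) return 5;
  if (c < 0xFE) return 6;
  return 1;
}

// Count characters by hopping from lead octet to lead octet.
// The hop may run past the end of a truncated sequence; the loop
// condition stops the walk in that case.
static std::size_t CountCharacters(const std::string& s,
                                   std::size_t (*length)(unsigned char)) {
  std::size_t count = 0;
  std::size_t i = 0;
  while (i < s.size()) {
    i += length(static_cast<unsigned char>(s[i]));
    ++count;
  }
  return count;
}
```

In this sketch, an F8 octet followed by four ASCII letters counts as one character under the old lengths (F8 swallows the next four octets) and as five under the new ones, which is the direction the commit message describes: the new count is never smaller.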