Forum: >>> Magnum BBS <<<

Re: Convert from \uXXXX to %XX%XX

From servoloro@21:1/5 to servoloro on Wed Feb 8 10:33:09 2023

Sorry it is:

s=s.replaceAll("\\\\u00dc", "%C3%9C");

On 2/8/23 10:31, servoloro wrote:

*Newbie question*
I have to convert a string from the format (how it's called ?)
\uXXXX
to (again:how it's called ?)
%XX%XX
i.e. from \u00dc to %C3%9C.
Apart from doing a dumb replaceAll
I'm sure there is a smarter way.
Not knowing the names of the formats Google didn't help me :-(
Could someone give me hints/directions ?
TIA

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From servoloro@21:1/5 to All on Wed Feb 8 10:31:50 2023

*Newbie question*
I have to convert a string from the format (what is it called?)
\uXXXX
to (again: what is it called?)
%XX%XX
i.e. from \u00dc to %C3%9C.
Apart from doing a dumb replaceAll
s=s.replaceAll("\u00dc", "%C3%9C");
I'm sure there is a smarter way.
Not knowing the names of the formats, Google didn't help me :-(
Could someone give me hints/directions?
TIA


From e.d.programmer@gmail.com@21:1/5 to All on Wed Feb 8 04:28:36 2023

*Newbie question*
I have to convert a string from the format (how it's called ?)
\uXXXX
to (again:how it's called ?)
%XX%XX
i.e. from \u00dc to %C3%9C.
Apart from doing a dumb replaceAll
s=s.replaceAll("\u00dc", "%C3%9C");
I'm sure there is a smarter way.
Not knowing the names of the formats Google didn't help me :-(
Could someone give me hints/directions ?
TIA

If you google "\u00dc" you'll see it's a Unicode escape, as written in a Java String literal; that code point specifically is "LATIN CAPITAL LETTER U WITH DIAERESIS".
Note that if you want to replace all occurrences of a literal string within a string, call the .replace method. Use .replaceAll only if you need the search pattern to be a regex.
If you google "%C3%9C" you'll see it's the same Unicode character, expressed in URL encoding (the percent-encoded UTF-8 bytes). If you google "java unicode url encode" you'll see several ways to do that, depending on your use case. Is it for a domain name? A query string parameter? A web page label value? Is there a framework? More context is required to get specific on the solution.
If you know that's the only character you'll need to convert, the replace method could suffice; otherwise you'll likely want to call an API encode method.
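
To make the replace vs. replaceAll point concrete, here is a minimal sketch. The class and method names (ReplaceVsEncode, viaReplace, viaReplaceAll, viaEncoder) and the sample string are made up for illustration, and it assumes the input holds the six-character escape as plain text and that UTF-8 is the wanted encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ReplaceVsEncode {
    // Literal search text: String.replace needs only Java string escaping.
    static String viaReplace(String s) {
        return s.replace("\\u00dc", "%C3%9C");
    }

    // Regex search text: the backslash is escaped a second time for the regex engine.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\\\u00dc", "%C3%9C");
    }

    // General case: let the library percent-encode the UTF-8 bytes of real characters.
    static String viaEncoder(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String escaped = "M\\u00dcLLER"; // the escape as six plain characters
        System.out.println(viaReplace(escaped));        // M%C3%9CLLER
        System.out.println(viaReplaceAll(escaped));     // M%C3%9CLLER
        System.out.println(viaEncoder("M\u00dcLLER"));  // M%C3%9CLLER
        // '.' is a wildcard for replaceAll but a plain dot for replace.
        System.out.println("a.b.c".replace(".", "-"));     // a-b-c
        System.out.println("a.b.c".replaceAll(".", "-"));  // -----
    }
}
```

The last two lines show why replace is the safer default when the search text is not meant as a pattern.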


From servoloro@21:1/5 to e.d.pro...@gmail.com on Wed Feb 8 14:53:16 2023

On 2/8/23 13:28, e.d.pro...@gmail.com wrote:

If you google "\u00dc" you'll see it's called unicode, expressed as a Java String, that code specifically being "latin capital letter U with diaeresis".
...

THANKS !


From Arne Vajhøj@21:1/5 to servoloro on Wed Feb 8 10:23:50 2023

On 2/8/2023 4:33 AM, servoloro wrote:

On 2/8/23 10:31, servoloro wrote:

*Newbie question*
I have to convert a string from the format (how it's called ?)
\uXXXX
to (again:how it's called ?)
%XX%XX
i.e. from \u00dc to %C3%9C.
Apart from doing a dumb replaceAll
I'm sure there is a smarter way.
Not knowing the names of the formats Google didn't help me :-(
Could someone give me hints/directions ?

Sorry it is:

s=s.replaceAll("\\\\u00dc", "%C3%9C");

There are a lot of complications here.
- "\u00dc" is 1 char but "\\u00dc" is 6 chars
- you seem to have an implicit assumption about UTF-8 encoding
- this type of encoding is generally known as URL encoding, but
there is some ambiguity in that, like whether you want
spaces as is or converted to plus signs

But the code below should illustrate a lot.

Arne

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UFun {
    // Hard-coded replacements: only handles the characters listed here.
    private static String encode_hack(String s) {
        return s.replace("\u00dc", "%C3%9C")
                .replace("\u00c6", "%C3%86")
                .replace("\u00d8", "%C3%98")
                .replace("\u00c5", "%C3%85")
                .replace("\u00e6", "%C3%A6")
                .replace("\u00f8", "%C3%B8")
                .replace("\u00e5", "%C3%A5");
    }

    // Percent-encode every UTF-8 byte outside printable ASCII.
    // Printable ASCII (including space and '%') is passed through unchanged.
    private static String encode_manual(String s) throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes("UTF-8")) {
            if (32 <= b && b < 127) {
                sb.append((char) b);
            } else {
                // Mask to 0..255 and always emit two uppercase hex digits.
                sb.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return sb.toString();
    }

    // URLEncoder writes space as '+' and ':' as "%3A"; undo both so the
    // output is comparable with the two variants above.
    private static String encode_builtin(String s) throws UnsupportedEncodingException {
        return URLEncoder.encode(s, "UTF-8").replace("+", " ").replace("%3A", ":");
    }

    private static void test1(String s) throws UnsupportedEncodingException {
        String s2a = encode_hack(s);
        System.out.printf("%s -> %s\n", s, s2a);
        String s2b = encode_manual(s);
        System.out.printf("%s -> %s\n", s, s2b);
        String s2c = encode_builtin(s);
        System.out.printf("%s -> %s\n", s, s2c);
    }

    // Matches a literal backslash, 'u' and four hex digits in the text.
    private static final Pattern p = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    // Turn each six-character escape in the text into the char it names.
    private static String decode(String s) {
        Matcher m = p.matcher(s);
        StringBuffer res = new StringBuffer();
        while (m.find()) {
            String ch = Character.toString((char) Integer.parseInt(m.group(1), 16));
            // quoteReplacement stops a decoded backslash or '$' from being
            // read as replacement syntax.
            m.appendReplacement(res, Matcher.quoteReplacement(ch));
        }
        m.appendTail(res);
        return res.toString();
    }

    private static String decode_encode_hack(String s) {
        return encode_hack(decode(s));
    }

    private static String decode_encode_manual(String s) throws UnsupportedEncodingException {
        return encode_manual(decode(s));
    }

    private static String decode_encode_builtin(String s) throws UnsupportedEncodingException {
        return encode_builtin(decode(s));
    }

    private static void test2(String s) throws UnsupportedEncodingException {
        String s2a = decode_encode_hack(s);
        System.out.printf("%s -> %s\n", s, s2a);
        String s2b = decode_encode_manual(s);
        System.out.printf("%s -> %s\n", s, s2b);
        String s2c = decode_encode_builtin(s);
        System.out.printf("%s -> %s\n", s, s2c);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        test1("This is \u00dc and Ü and Danish: ÆØÅæøå");
        test2("This is \\u00dc and Ü and Danish: ÆØÅæøå");
    }
}
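
The three complications listed earlier in this post can also be checked in isolation. The following is only a minimal sketch, separate from the program above; the class name EncodingChoices and the helpers formEncode and percentEncode are made up for illustration, and ISO-8859-1 is just one example of a non-UTF-8 charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodingChoices {
    // Percent-encode s using the named charset; URLEncoder writes space as '+'.
    static String formEncode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }

    // Same, but with spaces written as %20 instead of '+'.
    // Safe because a literal '+' in the input has already become %2B.
    static String percentEncode(String s, String charset) {
        return formEncode(s, charset).replace("+", "%20");
    }

    public static void main(String[] args) {
        // 1. One char in the string vs. six chars of escape text.
        System.out.println("\u00dc".length());   // 1
        System.out.println("\\u00dc".length());  // 6

        // 2. The percent sequence depends on the charset chosen.
        System.out.println(formEncode("\u00dc", "UTF-8"));       // %C3%9C
        System.out.println(formEncode("\u00dc", "ISO-8859-1"));  // %DC

        // 3. Space as plus sign vs. space as %20.
        System.out.println(formEncode("a b+c", "UTF-8"));     // a+b%2Bc
        System.out.println(percentEncode("a b+c", "UTF-8"));  // a%20b%2Bc
    }
}
```

Which of the two space conventions is right depends on where the result ends up (form data vs. a URL path, for example), so that choice is left to the caller here.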
