Time series naming: Encoding issue with Umlaut in results of x13/requests #13

suweg · 2018-05-08T10:58:39Z

Hi,

This issue is related to the bug fixed in #10. In 2.2.1 the timeseries are not renamed into "series1" ... anymore. Thanks for that fix.

However, there seems to be a problem with the encoding of Umlauts in the timeseries names of the resulting xml-file.

This is a snippet of the x13requests.xml I am sending. The encoding is correctly set to UTF-8 in the header:

<x13:Series name="Auftragsbestand Inland / in jeweiligen Preisen / Deutschland / Konsumgüter">

The umlaut ü is correctly encoded as u'\u00fc'.

And this is what I receive from the WS:
<tss:item name="Auftragsbestand Inland / in jeweiligen Preisen / Deutschland / Konsumg端ter">
The unicode character changed into u'\u7aef', a Chinese character.
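One way to narrow down where such a substitution comes from is to take the UTF-8 bytes of the name that was sent and decode them with a few other codecs, to see which one reproduces the received text. The sketch below does this with Python 3's standard codecs. The candidate list and the helper name `find_misdecoding` are illustrative assumptions, and a match only shows that the result is consistent with that codec, not that the service or the client actually used it.

```python
# Sketch: find which wrong codec turns the sent name into the received one.
# CANDIDATES and find_misdecoding are illustrative, not part of the web service.
CANDIDATES = ["latin-1", "cp1252", "euc_jp", "gbk", "shift_jis"]


def find_misdecoding(sent, received, candidates=CANDIDATES):
    """Return the codecs that decode the UTF-8 bytes of `sent` into `received`."""
    raw = sent.encode("utf-8")
    matches = []
    for codec in candidates:
        try:
            decoded = raw.decode(codec)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this codec
        if decoded == received:
            matches.append(codec)
    return matches


sent = "Konsumg\u00fcter"      # name as sent, with U+00FC
received = "Konsumg\u7aefter"  # name as received, with U+7AEF
print(find_misdecoding(sent, received))
```

With these candidates, EUC-JP reproduces the reported character, which would fit a client or server guessing the encoding of a response that does not declare one.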

I am fetching the results directly. When I try to print the response, I get a UnicodeEncodeError.

Any idea about this?

Regards,
Susanne


maggima · 2019-04-09T09:24:28Z

Dear Susanne,

Can you give me further information about which call you are making to the web service, please?
Is the request made from Java or another language? What is the calling source code? Are you adding special headers?

I don't have encoding problems when calling the service locally on my machine.
But I have a solution in mind: adding a filter to the service that puts the encoding into the Content-Type before the response is returned:

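package ec.nbb.ws.filters;

import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerResponseContext;
import javax.ws.rs.container.ContainerResponseFilter;
import javax.ws.rs.core.MediaType;

// Adds "charset=utf-8" to the response Content-Type when no charset is set.
// The javax.ws.rs (JAX-RS 2.x) package names are an assumption.
public class CharsetResponseFilter implements ContainerResponseFilter {

    @Override
    public void filter(ContainerRequestContext request, ContainerResponseContext response) {
        MediaType type = response.getMediaType();
        if (type != null && !type.getParameters().containsKey(MediaType.CHARSET_PARAMETER)) {
            MediaType typeWithCharset = type.withCharset("utf-8");
            response.getHeaders().putSingle("Content-Type", typeWithCharset);
        }
    }
}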

You can then register it in ApplicationConfig.java:

resources.add(ec.nbb.ws.filters.CharsetResponseFilter.class);
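To check from the client side whether a response's Content-Type already names a charset (which is what the filter above is meant to guarantee), the header value can be parsed with Python's standard library. This is a minimal sketch: `charset_of` is a hypothetical helper name and the header values are made-up examples, not captured from the service.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Example header values (assumed, for illustration only).
for value in ("application/xml", "application/xml; charset=utf-8"):
    print(value, "->", charset_of(value))
```

With requests, the value to pass in would be `response.headers.get("Content-Type", "")`.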

suweg · 2019-04-11T09:18:45Z

Dear Mats,

I am sending the request from a Python script with the first line below and saving the response with the two lines after it:

    import codecs
    import requests

    response = requests.post(url=ws_server_url+x13_api_point, headers=headers, data=x13reqFile)
    with codecs.open(response_filename, 'w', 'utf-8') as f:
        f.write(response.text)

We generate the requests (x13reqFile) with the CLI tools, and they only have the standard header:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

Actually, we no longer encounter this issue. It was indirectly fixed by "Fixed ts naming in requests. #5": we no longer have time series names with non-ASCII characters.

But I did a test, and the problem would still occur otherwise:
I sent a request with the time series name "pdp310fä", and when I opened the result in Notepad++ (set to UTF-8, of course), it looked like this: "pdp310fÃ¤". No error was thrown, though.

Does this answer your questions?
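The "Ã¤" seen in that test is what the two UTF-8 bytes of "ä" look like when they are read as Latin-1, which suggests the bytes on the wire may be fine and only the decoding step is off. A possible client-side workaround, sketched here under that assumption and for Python 3, is to decode the raw response body as UTF-8 rather than rely on a guessed encoding. The helper names are hypothetical.

```python
def decode_utf8_body(raw):
    """Decode raw response bytes as UTF-8, ignoring any guessed encoding."""
    return raw.decode("utf-8")


def repair_latin1_mojibake(text):
    """Undo a UTF-8-read-as-Latin-1 mix-up in text that was already decoded."""
    return text.encode("latin-1").decode("utf-8")


raw = "pdp310f\u00e4".encode("utf-8")  # bytes as a UTF-8 service would send them
garbled = raw.decode("latin-1")        # what a Latin-1 reader makes of them

print(garbled)
print(decode_utf8_body(raw))
print(repair_latin1_mojibake(garbled))
```

With requests, this would correspond to `decode_utf8_body(response.content)`; setting `response.encoding = "utf-8"` before reading `response.text` should have the same effect.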

maggima · 2019-04-30T08:15:37Z

It would be really useful to get one of your files that causes the problem, so I can better test during which operation the encoding issue occurs (request, response, reading the file, ...).

charphi assigned maggima May 18, 2018

charphi added the bug label May 18, 2018

suweg mentioned this issue Jun 5, 2020

Timeseries dissapears #22

Open
