[Using Sakai] Short URLs for files with non ASCII characters

Rafael Morales Gamboa rmorales at suv.udg.mx
Sun Aug 18 22:13:07 PDT 2013


Well, I followed some advice (
http://stackoverflow.com/questions/3513773/change-mysql-default-character-set-to-utf8-in-my-cnf)
and found that

BEFORE

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> show variables like 'collation%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | utf8_general_ci   |
| collation_server     | utf8_general_ci   |
+----------------------+-------------------+
3 rows in set (0.00 sec)

Then I added the following lines in my.cnf (recommended for 5.1, not for
5.5):

skip-character-set-client-handshake
collation-server=utf8_unicode_ci
character-set-server=utf8

and now I have

AFTER

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_unicode_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_unicode_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)

So the client and the connection were misconfigured, while the database
and server were right. I hope that means the database itself is correctly
configured, though some content may be wrongly encoded. So I deleted my file
and uploaded it again, but I got the same problem with the URL shortener.

Any ideas?
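
To make the mismatch concrete, here is a minimal Python sketch (not Sakai
code; the variable and function names are my own, for illustration only). It
assumes nothing about how the shortener works internally. It only shows that
the same 'ó' percent-encodes to %C3%B3 under UTF-8 and to %F3 under
ISO 8859-1, and that a lone 0xF3 byte is not valid UTF-8, which may be why a
server that decodes URIs as UTF-8 cannot find the resource.

```python
from urllib.parse import quote, unquote

# Shortened file name for illustration; "\u00f3" is 'ó' (o with acute accent).
name = "Introducci\u00f3n a Sakai.pdf"

# Percent-encode the same name under the two encodings seen in this thread.
utf8_url = quote(name, encoding="utf-8")      # 'ó' becomes %C3%B3
latin1_url = quote(name, encoding="latin-1")  # 'ó' becomes %F3


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(utf8_url)
    print(latin1_url)
    # A Latin-1 encoded name fails the UTF-8 check, a UTF-8 one passes.
    print(is_valid_utf8(name.encode("latin-1")))
    print(is_valid_utf8(name.encode("utf-8")))
    # Decoding the Latin-1 URL as UTF-8 does not give back the original name.
    print(unquote(latin1_url, encoding="utf-8") == name)
```

The same validity check could be applied to the bytes dumped with od further
down in this thread: a single byte with octal value 363 (0xF3) followed by
'n' is not a valid UTF-8 sequence.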


2013/8/18 Rafael Morales Gamboa <rmorales at suv.udg.mx>

> According to the MySQL documentation:
>
> SHOW CREATE {DATABASE | SCHEMA} [IF NOT EXISTS] *db_name*
>
> Shows the CREATE DATABASE statement
> <http://dev.mysql.com/doc/refman/5.0/en/create-database.html> that creates
> the named database. If the SHOW statement includes an IF NOT EXISTS clause,
> the output too includes such a clause. SHOW CREATE SCHEMA is a synonym for
> SHOW CREATE DATABASE
> <http://dev.mysql.com/doc/refman/5.0/en/show-create-database.html> as of
> MySQL 5.0.2.
>
> and I get
>
> show create database sakai;
>
> +----------+----------------------------------------------------------------+
> | Database | Create Database                                                |
> +----------+----------------------------------------------------------------+
> | sakai    | CREATE DATABASE `sakai` /*!40100 DEFAULT CHARACTER SET utf8 */ |
> +----------+----------------------------------------------------------------+
>
>
> So I guess our database was created as UTF-8.
>
>
>
> 2013/8/18 Rafael Morales Gamboa <rmorales at suv.udg.mx>
>
>> It looks like the database was set up as UTF-8 from the beginning, because
>> that is what the instructions to install Sakai CLE 2.9 say. I do not
>> remember changing it to UTF-8 after installation. The fact is that the full
>> URL uses UTF-8, but the shortener encodes it as Latin-1.
>>
>> Is there any way to test whether the database was created as UTF-8?
>> Could someone who is totally sure their database was created as UTF-8 try
>> to replicate my case? It would require uploading a document from Windows
>> with an 'ó' (o with acute accent) in its name, and then generating a short
>> URL for it.
>>
>>
>> 2013/8/18 Steve Swinsburg <steve.swinsburg at gmail.com>
>>
>>> If your database was not set up as UTF-8 to begin with, then you may need
>>> to export it all out and re-import it with the correct character set. I
>>> haven't done this though.
>>>
>>>
>>> On Mon, Aug 19, 2013 at 9:56 AM, Rafael Morales Gamboa <
>>> rmorales at suv.udg.mx> wrote:
>>>
>>>> I queried the url_randomised_mappings_t table, ran the result through od,
>>>> and found that
>>>>
>>>> 0000000   1   1  \t   H   a   q   X   O   C  \t   h   t   t   p   :   /
>>>> 0000020   /   v   i   r   t   u   a   l   .   c   u   d   i   .   e   d
>>>> 0000040   u   .   m   x   :   8   0   8   0   /   a   c   c   e   s   s
>>>> 0000060   /   c   o   n   t   e   n   t   /   g   r   o   u   p   /   8
>>>> 0000100   b   f   7   3   c   8   c   -   0   9   6   8   -   4   9   1
>>>> 0000120   4   -   8   c   4   4   -   9   1   8   d   7   6   f   a   3
>>>> 0000140   8   9   6   /   I   n   t   r   o   d   u   c   c   i 363   n
>>>> 0000160       a       S   a   k   a   i       -       P   e   r   s   p
>>>> 0000200   e   c   t   i   v   a       d   e   l       U   s   u   a   r
>>>> 0000220   i   o   .   p   d   f  \n
>>>>
>>>> So the URL is stored using 0xF3 (octal 363) for the 'ó' (o with acute
>>>> accent). In other words, Sakai/MySQL is using the ISO 8859-1 encoding
>>>> instead of UTF-8 to store the shortened URL.
>>>>
>>>> Regards,
>>>> Rafael
>>>>
>>>>
>>>> 2013/8/18 Steve Swinsburg <steve.swinsburg at gmail.com>
>>>>
>>>>> Go to the database and see what the full url is for this shortened
>>>>> URL. Does it match the actual resource URL?
>>>>>
>>>>> Cheers,
>>>>> Steve
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On 17/08/2013, at 7:23, Rafael Morales Gamboa <rmorales at suv.udg.mx>
>>>>> wrote:
>>>>>
>>>>> Actually, I was wrong. Tomcat is configured with URIEncoding="UTF-8"
>>>>> and the database seems to be OK (despite the fact that there is no
>>>>> character_set_server=utf8 in my.cnf):
>>>>>
>>>>> mysql> SELECT default_character_set_name FROM
>>>>> information_schema.SCHEMATA S
>>>>>     -> WHERE schema_name = "sakai";
>>>>> +----------------------------+
>>>>> | default_character_set_name |
>>>>> +----------------------------+
>>>>> | utf8                       |
>>>>> +----------------------------+
>>>>> 1 row in set (0.00 sec)
>>>>>
>>>>> The question is whether I should add that configuration line to
>>>>> my.cnf anyway.
>>>>>
>>>>> Regards,
>>>>> Rafael
>>>>>
>>>>> On 15/08/2013 06:50 p.m., Mike DeSimone wrote:
>>>>>
>>>>>  Hi Rafael,
>>>>>
>>>>>  Sometimes things like this are due to database settings and tomcat
>>>>> connector settings.  Can you confirm the 8080 connector
>>>>> has URIEncoding="UTF-8"?  Also, if this is MySQL, then ensure this is in
>>>>> the my.cnf: character_set_server=utf8 and the database was created with
>>>>> utf8 encoding.
>>>>>
>>>>>  more info can be found here:
>>>>> https://confluence.sakaiproject.org/display/DOC/Sakai+CLE+2.9+Release+Notes#SakaiCLE29releasenotes-Databasesupport
>>>>> and:
>>>>> https://confluence.sakaiproject.org/pages/viewpage.action?pageId=82249313 (section 3 for Tomcat setup)
>>>>>
>>>>>  I also have found that files cannot easily be renamed in the Sakai
>>>>> UI.  Not sure if there's any other solution than your suggestion.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> *********************************************
>>>>> Mike DeSimone
>>>>> Director of Enterprise Operations
>>>>> Asahi Net International, Inc. (*ANI*)
>>>>> O: +1 (602) 490-0473
>>>>>  W: anisakai.com
>>>>>
>>>>>   **********************************************
>>>>>
>>>>>
>>>>> On Wed, Aug 14, 2013 at 4:25 PM, Rafael Morales Gamboa <
>>>>> rmorales at suv.udg.mx> wrote:
>>>>>
>>>>>>  Hello,
>>>>>>
>>>>>> On Sakai 2.9.1 I uploaded a file to Resources that was called
>>>>>> 'Introducción a Sakai - Perspectiva del Usuario.pdf' on my computer, then
>>>>>> renamed it to 'Introducción a Sakai.pdf' in Resources and made it public.
>>>>>> Under Properties I can now see its URL:
>>>>>>
>>>>>>
>>>>>> http://virtual.cudi.edu.mx:8080/access/content/group/8bf73c8c-0968-4914-8c44-918d76fa3896/Introducci%C3%B3n%20a%20Sakai%20-%20Perspectiva%20del%20Usuario.pdf
>>>>>>
>>>>>> So the URL is composed from the original name of the file on my
>>>>>> computer, and I cannot see a way to change it other than renaming the
>>>>>> file on my computer before uploading it to Resources.
>>>>>>
>>>>>> Now, if I choose the Short URL option I get the URL
>>>>>> http://virtual.cudi.edu.mx:8080/x/HaqXOC, and at least three
>>>>>> distinct behaviours across browsers (latest public releases of each):
>>>>>>
>>>>>>    1. In Firefox, I get error 404.
>>>>>>    2. In Chrome, it is translated to
>>>>>>    http://virtual.cudi.edu.mx:8080/access/content/group/8bf73c8c-0968-4914-8c44-918d76fa3896/Introducci%F3n%20a%20Sakai%20-%20Perspectiva%20del%20Usuario.pdf
>>>>>>    ('ó' is encoded as %F3 instead of %C3%B3) and I get error 404.
>>>>>>    3. In Internet Explorer it works! ('ó' is just 'ó').
>>>>>>
>>>>>> The questions are (1) why do I get all these different behaviours? and
>>>>>> (2) how can I get it to work on all these browsers?
>>>>>>
>>>>>> Regards,
>>>>>> Rafael
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> sakai-user mailing list
>>>>>> sakai-user at collab.sakaiproject.org
>>>>>> http://collab.sakaiproject.org/mailman/listinfo/sakai-user
>>>>>>
>>>>>> TO UNSUBSCRIBE: send email to
>>>>>> sakai-user-unsubscribe at collab.sakaiproject.org with a subject of
>>>>>> "unsubscribe"
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>