[Building Sakai] PDF uploads to Resources

Steve Swinsburg steve.swinsburg at gmail.com
Thu Sep 2 18:06:09 PDT 2010


To fix these I would suggest a Quartz job that finds all the affected resources, reads them out, updates the type then saves them again, all using the ContentHostingService API. Of course if it's only a handful it could be done manually. 
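
As a rough illustration of why this has to be a read, update, save cycle rather than an edit of the stored bytes, here is a minimal Java sketch. It assumes a toy length-prefixed format (an entry count followed by writeUTF key/value pairs), which is NOT Sakai's actual BINARY_ENTITY layout, and the names LengthPrefixedDemo, encode, decode and fixContentType are made up for the example. In Sakai itself the same steps would go through the ContentHostingService API rather than touching the bytes directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class LengthPrefixedDemo {

    // Toy format (an assumption, not Sakai's real one): an int entry count,
    // then each key and value written with writeUTF, which stores a 2-byte
    // length in front of every string.
    static byte[] encode(Map<String, String> props) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(props.size());
        for (Map.Entry<String, String> e : props.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
        out.flush();
        return buf.toByteArray();
    }

    // Reads the toy format back into an ordered map.
    static Map<String, String> decode(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = in.readInt();
        Map<String, String> props = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            props.put(key, value);
        }
        return props;
    }

    // Read, update the type, write back: re-encoding recomputes every length
    // prefix, which a plain string replace on the bytes would not do.
    static byte[] fixContentType(byte[] data, String newType) throws IOException {
        Map<String, String> props = decode(data);
        props.put("DAV:getcontenttype", newType);
        return encode(props);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("DAV:getcontenttype", "text/url");
        props.put("CHEF:copyrightchoice", "I hold copyright.");

        byte[] original = encode(props);
        byte[] fixed = fixContentType(original, "application/pdf");

        System.out.println("type after fix: " + decode(fixed).get("DAV:getcontenttype"));
        System.out.println("size change in bytes: " + (fixed.length - original.length));
    }
}
```

The new value is longer than the old one, so the stored record grows and its length prefix changes with it. That is the part a byte-level replace would get wrong.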

Cheers, 
Steve


On 03/09/2010, at 3:47, Omer Piperdi <omer at rice.edu> wrote:

> I modified the query a little. Here is what I came up with:
> 
> SELECT a.resource_id, a.file_path, a.xml,
>        a.resource_uuid, a.binary_entity
>   FROM content_resource a
>   where upper(a.resource_id) like upper('%.pdf%')
>   and upper(a.resource_id) not like upper('%http:%')
>   and a.xml is null
>   and dbms_lob.getlength(a.binary_entity) < 2000
>   and utl_raw.cast_to_varchar2(hextoraw(dbms_lob.substr(a.binary_entity))) like '%text/url%'
> 
> Thanks
> Omer
> 
> On 9/2/2010 12:23 PM, Omer Piperdi wrote:
>> 
>> The query is good for the time being to see how many people are having
>> this issue. (But I saw "ORA-06502: PL/SQL: numeric or value error: raw
>> variable length too long" when I ran the query for all PDFs.)
>> 
>> We also have PDFs uploaded as application/binary, but those at least
>> prompt the user to choose a program to open them. text/url is
>> throwing a 404.
>> 
>> Thanks again,
>> Omer
>> 
>> On 9/2/2010 11:54 AM, Matthew Jones wrote:
>>> Well, there are queries you can run to *see*, but there's no straight SQL
>>> you can run to modify it. You'd have to write some code to do it.
>>> Unfortunately this data is encoded in a binary field (rather than text),
>>> which makes it faster to process but not possible to modify in place.
>>> This is in BINARY_ENTITY in CONTENT_RESOURCE. I wrote about this last
>>> February [1] and provided a query for Oracle. I don't know what it would
>>> be for MySQL. You can't run a string replace on this field because the
>>> length of each element is encoded within the string (with unprintable
>>> characters), and if anything is changed to be longer or shorter it will
>>> break when it is read back. So you'd actually need to read it,
>>> de-serialize it with code, change it, and write it back. The Java that
>>> does this is in DbContentService.java (see the link) if you wanted to
>>> try, or it could probably also be done in some scripting language.
>>> 
>>> An example of binary to text for a pdf file is below . . . Unfortunately
>>> for this file it thinks that this was uploaded as application/binary as
>>> well, instead of pdf.
>>> 
>>> CHSBRE
>>> B/group/b11a03c0-b1a5-40d2-8617-63ad4aa968e9e/*LIS Tracking
>>> Sheet.pdf*)org.sakaiproject.content.types.fileUpload inherited����
>>> d e'http://purl.org/dc/elements/1.1/creatore DAV:getlastmodified
>>> 20100324000103169e DAV:get*contenttype application/binary*e
>>> SAKAI:content_priority
>>> 2e'http://purl.org/dc/elements/1.1/subjecte)http://purl.org/dc/elements/1.1/publishere!http://purl.org/dc/terms/abstracte+http://purl.org/dc/elements/1.1/alternativee
>>> CHEF:copyrightchoice I hold copyright.e
>>> CHEF:modifiedby$05d1fgf55-5qaw-4340-8f25-214a7e332097e!http://purl.org/dc/terms/audiencee
>>> DAV: . . .
>>> 
>>> [1] http://collab.sakaiproject.org/pipermail/sakai-dev/2010-February/005709.html
>>> 
>>> On Thu, Sep 2, 2010 at 11:39 AM, Omer Piperdi <omer at rice.edu> wrote:
>>> 
>>>     Is there a query that I can run against the content_resource table to
>>>     see whether resource_id has .pdf in it and the resource type is not
>>>     application/pdf?
>>> 
>>>     Which column has file type info?
>>> 
>>>     Thanks
>>>     Omer
>>> 
>>> 
>>>     On 9/1/2010 5:07 PM, Matthew Jones wrote:
>>> 
>>>         Yeah, it's a bug with Firefox on some platforms; there is
>>>         currently no fix for Sakai.
>>> 
>>>         There was a JIRA proposed (KNL-101) to use a file type detection
>>>         library (like mime-util). However, it *looked* like it involved
>>>         changing some APIs in the kernel, and I haven't finished fixing
>>>         it yet. I hope to get to looking at it again before the 2.8
>>>         freeze, but I have a number of higher local priorities before
>>>         then. :(
>>> 
>>>         -Matthew
>>> 
>>>         On Wed, Sep 1, 2010 at 5:54 PM, Omer Piperdi <omer at rice.edu> wrote:
>>> 
>>>             We have seen PDF uploads to Resources create the file type as
>>>             text/url instead of application/pdf, which leaves users
>>>             unable to open the file.
>>> 
>>>             We upgraded our Sakai Kernel to 1.1.9 and are running the
>>>             2.7.x branch. This is happening mostly on a Mac with Firefox.
>>> 
>>>             Has anyone seen this, or is there any pointer to a JIRA?
>>> 
>>>             Thanks
>>>             Omer
>>>             _______________________________________________
>>>             sakai-dev mailing list
>>>             sakai-dev at collab.sakaiproject.org
>>>             http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
>>> 
>>>             TO UNSUBSCRIBE: send email to
>>>             sakai-dev-unsubscribe at collab.sakaiproject.org with a
>>>             subject of "unsubscribe"