[Building Sakai] PDF uploads to Resources

Omer Piperdi omer at rice.edu
Thu Sep 2 10:47:22 PDT 2010


I modified the query a little. Here is what I came up with:

SELECT a.resource_id, a.file_path, a.xml,
        a.resource_uuid, a.binary_entity
   FROM content_resource a
   where upper(a.resource_id) like upper('%.pdf%')
   and upper(a.resource_id) not like upper('%http:%')
   and a.xml is null
   and dbms_lob.getlength(a.binary_entity) < 2000
   and utl_raw.cast_to_varchar2(hextoraw(dbms_lob.substr(a.binary_entity)))
       like '%text/url%'
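
As the quoted reply below explains, a plain string replace on BINARY_ENTITY is unsafe because the length of each element is encoded in front of it. Here is a minimal Python sketch of that failure mode. It assumes a made-up one-byte length prefix per field; this is NOT Sakai's actual serialization format, and the names encode_fields, decode_fields and survives_round_trip are hypothetical helpers for illustration only.

```python
import struct


def encode_fields(fields):
    """Serialize strings as [1-byte length][UTF-8 bytes] records.

    Hypothetical toy format, not the real Sakai encoding.
    Each field must be at most 255 bytes long.
    """
    out = bytearray()
    for field in fields:
        data = field.encode("utf-8")
        out += struct.pack("B", len(data)) + data
    return bytes(out)


def decode_fields(blob):
    """Read back records written by encode_fields."""
    fields = []
    pos = 0
    while pos < len(blob):
        (length,) = struct.unpack_from("B", blob, pos)
        pos += 1
        chunk = blob[pos:pos + length]
        if len(chunk) != length:
            raise ValueError("record runs past the end of the data")
        fields.append(chunk.decode("utf-8"))
        pos += length
    return fields


def survives_round_trip(blob, expected):
    """Return True only if blob decodes cleanly to the expected fields."""
    try:
        return decode_fields(blob) == expected
    except (ValueError, UnicodeDecodeError):
        return False


original = encode_fields(
    ["DAV:getcontenttype", "text/url", "SAKAI:content_priority", "2"]
)
wanted = ["DAV:getcontenttype", "application/pdf", "SAKAI:content_priority", "2"]

# Naive fix: replace the bytes in place. The value grows from 8 to 15 bytes,
# but the stored length prefix still says 8.
patched = original.replace(b"text/url", b"application/pdf")

# Safer fix: de-serialize, change the value, then serialize again so that
# every length prefix is recomputed.
fields = decode_fields(original)
fixed = encode_fields(
    ["application/pdf" if f == "text/url" else f for f in fields]
)

print("naive replace decodes correctly:", survives_round_trip(patched, wanted))
print("decode/modify/encode decodes correctly:", survives_round_trip(fixed, wanted))
```

In this toy format the naive replace leaves a stale length prefix, so the reader loses track of where each field starts; the decode, modify, re-encode path keeps the lengths consistent. A real fix would have to follow the actual format handled in DbContentService.java.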

Thanks
Omer

On 9/2/2010 12:23 PM, Omer Piperdi wrote:
>
> The query is good for the time being to see how many people are having this
> issue. (But I saw "ORA-06502: PL/SQL: numeric or value error: raw
> variable length too long" when I ran the query for all PDFs.)
>
> We also have PDFs uploaded as application/binary. Those at least prompt
> the user to choose a program to open them; text/url is
> throwing a 404.
>
> Thanks again,
> Omer
>
> On 9/2/2010 11:54 AM, Matthew Jones wrote:
>> Well, there are queries you can run to *see*, but there's no straight SQL
>> you can run to modify it. You'd have to write some code to do it.
>> Unfortunately this data is encoded in a binary field (rather than text),
>> which makes it faster to process but not possible to modify with plain
>> SQL. This is in BINARY_ENTITY in CONTENT_RESOURCE. I wrote about this
>> last February [1] and provided a query for Oracle. I don't know what it
>> would be for MySQL. You can't run a string replace on this field because
>> the length of each element is encoded within the string (with unprintable
>> characters), and if anything is changed to be longer or shorter it will
>> break when it is read back. So you'd actually need to read it,
>> de-serialize it with code, and write it back. The Java that does this
>> is in this link in DbContentService.java if you wanted to try, or it
>> could probably also be done in some scripted language.
>>
>> An example of the binary converted to text for a PDF file is below.
>> Unfortunately, this file is also recorded as having been uploaded as
>> application/binary instead of application/pdf.
>>
>> CHSBRE
>> B/group/b11a03c0-b1a5-40d2-8617-63ad4aa968e9e/*LIS Tracking
>> Sheet.pdf*)org.sakaiproject.content.types.fileUpload inherited����
>> d e'http://purl.org/dc/elements/1.1/creatore DAV:getlastmodified
>> 20100324000103169e DAV:get*contenttype application/binary*e
>> SAKAI:content_priority
>> 2e'http://purl.org/dc/elements/1.1/subjecte)http://purl.org/dc/elements/1.1/publishere!http://purl.org/dc/terms/abstracte+http://purl.org/dc/elements/1.1/alternativee
>> CHEF:copyrightchoice I hold copyright.e
>> CHEF:modifiedby$05d1fgf55-5qaw-4340-8f25-214a7e332097e!http://purl.org/dc/terms/audiencee
>> DAV: . . .
>>
>> [1]
>> http://collab.sakaiproject.org/pipermail/sakai-dev/2010-February/005709.html
>>
>> On Thu, Sep 2, 2010 at 11:39 AM, Omer Piperdi <omer at rice.edu> wrote:
>>
>>      Is there a query that I can run against the content_resource table to
>>      see whether resource_id has .pdf in it and the resource type is not
>>      application/pdf?
>>
>>      Which column has file type info?
>>
>>      Thanks
>>      Omer
>>
>>
>>      On 9/1/2010 5:07 PM, Matthew Jones wrote:
>>
>>          Yeah, it's a bug with Firefox on some platforms; there is
>>          currently no fix for Sakai.
>>
>>          There was a JIRA proposed (KNL-101) to use a file type detection
>>          library (like mime-util). However, it *looked* like it involved
>>          changing some APIs in the kernel, and I haven't finished fixing
>>          it yet. I'll hopefully get to looking at it again before the 2.8
>>          freeze, but I have a number of higher local priorities before
>>          then. :(
>>
>>          -Matthew
>>
>>          On Wed, Sep 1, 2010 at 5:54 PM, Omer Piperdi <omer at rice.edu> wrote:
>>
>>              We have seen PDF uploads to Resources create the file type as
>>              text/url instead of application/pdf, which leaves users
>>              unable to open the file.
>>
>>              We upgraded our Sakai Kernel to 1.1.9 and are running the
>>              2.7.x branch. This is happening mostly on a Mac with Firefox.
>>
>>              Has anyone seen this, or is there a pointer to a JIRA?
>>
>>              Thanks
>>              Omer
>>              _______________________________________________
>>              sakai-dev mailing list
>>              sakai-dev at collab.sakaiproject.org
>>              http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
>>
>>              TO UNSUBSCRIBE: send email to
>>              sakai-dev-unsubscribe at collab.sakaiproject.org with a
>>              subject of "unsubscribe"
>>
>>
>>
>>
>>

