[Building Sakai] truncating to fit field

Matthew Jones matthew at longsight.com
Fri Feb 28 08:05:34 PST 2014


Well it sounds like there's three possible sets

1) An existing database where it was setup properly with
NLS_LENGTH_SEMANTICS [1] set prior to creation of the Sakai tables or
converted sometime in between. - No action needed!
  - Hopefully this is the majority of cases
2) An existing database where this parameter wasn't set and the VARCHAR2's
are BYTES
  - Changing all the SQL files would have no effect here, they'd need some
kind of conversion script.
      Either converting everything. (
http://bluefrog-oracle.blogspot.com/2011/09/script-submitted-to-otn-to-altter.html
)
      Or converting selective fields. (Something we'd provide?)
3) Someone starting out fresh on Oracle
  - While it seems less schools are coming up on Oracle at this point, we
could
    - Put in a check (similar to the content check) advising that they
should run run NLS_LENGTH_SEMANTICS or they'll have to run a conversion
down the road.
    - Change the legacy SQL scripts, but as Chuck mentioned we might miss
some, future ones might come up and be incorrect still and some might
really intend to be BYTE.  It looks like ~200 changes in kernel alone.

Some issues with #3, some of the files are actually defined already in the
script *as* byte. So it's probable that not everything should be CHAR.
./mailarchive/mailarchive-impl/impl/src/sql/oracle/sakai_mailarchive.sql:
    CHANNEL_ID           VARCHAR2(99 BYTE) NOT NULL,

After looking and thinking about this. I feel like if you notice a problem
with certain select fields (like group name) then that should be a bug and
a conversion script created to convert that specific field. Seth has filed
issues similar to this in the past (
https://jira.sakaiproject.org/browse/SAK-20122). I don't think we should
completely change everything over in all the legacy SQL files.
NLS_LENGTH_SEMANTICS seems like a good recommendation for a new
installation though, but if it's a shared instance that might not be
possible. I'm also not sure about setting it up as part of the session and
it's not something easy for me to test (or something that has very high
time value).

[1] http://wiki.loopback.org/index.php/Setting_NLS_LENGTH_SEMANTICS



On Fri, Feb 28, 2014 at 10:17 AM, Charles Hedrick <hedrick at rutgers.edu>wrote:

> Actually, I'd like to track down why it's base to set
> NLS_LENGTH_SEMANITICS. My concern with changing all the declarations is
> that the next time someone creates a new table they may forget, and we'll
> end up with some columns on way and some another. That's why the Oracle
> document I quoted recommended using NLS_LENGth_SEMANTICS. It may well be
> that different Oracle documents make contradictory recommendations here.
> I've also seen a document that recommended not setting it globally, but
> setting it for specific sessions. That seems weird since you different
> sessions accessing the same data with different length semantics would seem
> to be a recipe for disaster.
>
> On Feb 28, 2014, at 10:14 AM, Charles Hedrick <hedrick at rutgers.edu> wrote:
>
> > It seems to me that the best approach would be to find all the .sql
> files and add "char" to all the definitions. That sounds like something to
> do for Sakai 11.
> >
> > On Feb 28, 2014, at 10:10 AM, Brian Jones <bjones86 at uwo.ca> wrote:
> >
> >> We run with Oracle DB's at Western and we've had problems with this in
> the
> >> past. Many, many columns are created as VARCHAR2 (XXX), where XXX is a
> >> number of BYTEs rather than CHARs.
> >>
> >> From the documentation we've read, and according to our DBA, it is not
> >> recommended to set NLS_LENGTH_SEMANTICS=char.
> >>
> >> I think it would be better if the auto.ddl scripts created the columns
> >> explicitly as VARCHAR2 (XXX CHAR).
> >>
> >> Brian Jones
> >> Programmer/Analyst
> >> Information Technology Services
> >> Support Services Building, Room 4326
> >> Western University
> >> (519) 661-2111 x86969
> >> bjones86 at uwo.ca
> >>
> >> From: sakai-dev-bounces at collab.sakaiproject.org
> >> [mailto:sakai-dev-bounces at collab.sakaiproject.org] On Behalf Of Sam
> >> Ottenhoff
> >> Sent: Friday, February 28, 2014 9:43 AM
> >> To: Charles Hedrick
> >> Cc: sakai-dev
> >> Subject: Re: [Building Sakai] truncating to fit field
> >>
> >> We already have code in Sakai to check whether the database connection
> can
> >> read/write proper UTF-8.  I would be fine with adding another check for
> >> Oracle semantics, but it doesn't make much sense to me to rewrite all
> insert
> >> string truncation code to deal with an Oracle db that isn't multi-byte
> >> preferred.
> >>
> >> On Fri, Feb 28, 2014 at 8:21 AM, Charles Hedrick <hedrick at rutgers.edu>
> >> wrote:
> >>>
> >>> The Oracle web page recommends not using "char" in the declaration, but
> >> instead setting NLS_LENGTH_SEMANTICS to char. However unless I have some
> >> assurance that sakai sitss actually set NLS_LENGTH_SEMANTICS, I need to
> >> assume the worst.
> >>>
> >>> On Feb 27, 2014, at 5:12 PM, Sam Ottenhoff <ottenhoff at longsight.com>
> >> wrote:
> >>>
> >>>> Adding code to work around this seems wrong.  Isn't the issue that
> your
> >> database was setup with a non-multibyte character set which means that
> the
> >> length calculations Oracle is making are based on one byte = one
> character?
> >> Our confluence page
> >> (https://confluence.sakaiproject.org/display/I18N/Common+UTF-8+Problems
> )
> >> recommends "AL32UTF8".
> >>>>
> >>>> Can any of our Oracle-using institutions chime in here?  If the
> database
> >> is setup initially with a multi-byte character set, are the tables
> created
> >> correctly where the length of SAKAI_SITE_GROUP/TITLE is 99 multi-byte
> >> characters and not 99 bytes?
> >>>>
> >>>> --Sam
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Feb 27, 2014 at 4:49 PM, Charles Hedrick <hedrick at rutgers.edu
> >
> >> wrote:
> >>>> my new code uses this: https://gist.github.com/lpar/1031951, except
> for
> >> mysql, where a substring will do (because it interprets limits as
> >> characters, not bytes).
> >>>>
> >>>> Yuck.
> >>>>
> >>>> This is LSNBLDR-352.
> >>>>
> >>>> On Feb 27, 2014, at 3:54 PM, Steve Swinsburg <
> steve.swinsburg at gmail.com>
> >> wrote:
> >>>>
> >>>>> The field is declared wrong, if it's going to store utf8 but use
> >> substrb or substrc to trim the string in bytes or utf8 chars.
> >>>>>
> >>>>> Cheers
> >>>>> Steve
> >>>>>
> >>>>> sent from my mobile device
> >>>>>
> >>>>> On 28/02/2014 3:29 AM, "Charles Hedrick" <hedrick at rutgers.edu>
> wrote:
> >>>>> I am generating a group name. The maximum length is 99 characters. In
> >> mysql it appears that truncating to 99 characters is fine, because mysql
> >> interprets field lengths in characters.
> >>>>>
> >>>>> However in Oracle, unless the declaration uses "char", it is
> understood
> >> to be bytes. That means that UTF-8 characters can take more than one
> space.
> >> Is there some standard method in Sakai to truncate a string to a length
> that
> >> will fit into a given database field?
> >>>>>
> >>>>> Unfortunately, the SAKAI_SITE_GROUP is defined in oracle using field
> >> names without "char" after them.
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> sakai-dev mailing list
> >>>>> sakai-dev at collab.sakaiproject.org
> >>>>> http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
> >>>>>
> >>>>> TO UNSUBSCRIBE: send email to
> >> sakai-dev-unsubscribe at collab.sakaiproject.org with a subject of
> >> "unsubscribe"
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> sakai-dev mailing list
> >>>> sakai-dev at collab.sakaiproject.org
> >>>> http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
> >>>>
> >>>> TO UNSUBSCRIBE: send email to
> >> sakai-dev-unsubscribe at collab.sakaiproject.org with a subject of
> >> "unsubscribe"
> >>>>
> >>>
> >>
> >
>
> _______________________________________________
> sakai-dev mailing list
> sakai-dev at collab.sakaiproject.org
> http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
>
> TO UNSUBSCRIBE: send email to
> sakai-dev-unsubscribe at collab.sakaiproject.org with a subject of
> "unsubscribe"
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://collab.sakaiproject.org/pipermail/sakai-dev/attachments/20140228/455ec167/attachment.html 


More information about the sakai-dev mailing list