[Building Sakai] Playing with the Sakai Mailing List Data

Alan Berg bergsmooth at gmail.com
Mon Apr 8 00:44:05 PDT 2013


Hi Chuck,

You can do a nice Social Network Analysis of the e-mails and compare it to
commit history. Hopefully, the email address can act as a UID.
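
A minimal sketch of that comparison, assuming the mail has already been reduced to (replier, original poster) pairs and the commit log to a count per committer address. The function names and the example.org addresses are invented for illustration; nothing here comes from the actual data set.

```python
from collections import Counter

def normalize(addr):
    """Turn the archive's 'name at host' form into a lowercase 'name@host' UID."""
    return addr.strip().lower().replace(" at ", "@")

def reply_graph(replies):
    """Count directed edges (replier -> original poster) from (from, to) pairs."""
    edges = Counter()
    for sender, target in replies:
        a, b = normalize(sender), normalize(target)
        if a != b:  # ignore people replying to themselves
            edges[(a, b)] += 1
    return edges

def degree(edges):
    """Replies sent plus replies received, per address."""
    deg = Counter()
    for (a, b), n in edges.items():
        deg[a] += n
        deg[b] += n
    return deg

def compare(deg, commits):
    """Pair each address's mail degree with its commit count (0 if absent)."""
    norm_commits = Counter()
    for addr, n in commits.items():
        norm_commits[normalize(addr)] += n
    keys = set(deg) | set(norm_commits)
    return {k: (deg.get(k, 0), norm_commits.get(k, 0)) for k in sorted(keys)}

# Invented sample data, only to show the shape of the inputs.
replies = [
    ("alice at example.org", "bob@example.org"),
    ("Bob@example.org", "alice@example.org"),
    ("carol@example.org", "alice@example.org"),
]
commits = {"alice@example.org": 5, "dave@example.org": 2}
table = compare(degree(reply_graph(replies)), commits)
```

The address only works as a UID after this kind of normalisation, and only where people commit under the same address they post from. Anyone who does not will show up as two separate nodes.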

Regards,

Alan


On 8 April 2013 04:53, Charles Severance <csev at umich.edu> wrote:

> As part of the SI301 - Networked Computing class that I am teaching at
> UMich this semester, I am using the entire Sakai dev list from 2005 to give
> them a significant amount of data to chew on and visualize.  Here is the
> README for all the code I have written so far:
>
> https://github.com/csev/networks-code/blob/master/gmane/README.txt
>
> It consists of a gmane spider (gmane.py), a data cleanup / indexing step
> (gmodel.py), and a data analysis phase (gbasic.py).  Over the next few
> weeks students will write more programs to analyze and visualize the data.
> As the README above explains, cleaning up the data is pretty tricky and
> prone to error, so I figured I would give you all access to the data and
> let you see if it needs more cleaning before I turn my students loose on
> it.
>
> Here is the cleaned data:
>
> http://www-personal.umich.edu/~csev/sakai/email/index.sqlite
>
> You can use the Firefox SQLiteManager plugin to look at the data.  The
> data model is pretty obvious and the message headers and bodies are
> compressed with Python's zlib.  The first and very simple data analysis
> program to read this file is:
>
> http://www-personal.umich.edu/~csev/sakai/email/gbasic.py
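
For anyone who would rather poke at the file from Python than from a browser plugin, here is a rough sketch of reading zlib-compressed columns out of SQLite. The table and column names (Messages, headers, body) are assumptions made for illustration, not taken from index.sqlite, so the demo builds its own tiny in-memory database.

```python
import sqlite3
import zlib

def read_messages(conn):
    """Yield (id, headers, body) with the compressed columns decompressed.

    Assumes a hypothetical table Messages(id, headers BLOB, body BLOB).
    """
    cur = conn.execute("SELECT id, headers, body FROM Messages ORDER BY id")
    for msg_id, headers, body in cur:
        yield (
            msg_id,
            zlib.decompress(headers).decode("utf-8", errors="replace"),
            zlib.decompress(body).decode("utf-8", errors="replace"),
        )

def demo_db():
    """Build a one-row in-memory database with the assumed layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Messages (id INTEGER PRIMARY KEY, headers BLOB, body BLOB)"
    )
    conn.execute(
        "INSERT INTO Messages (headers, body) VALUES (?, ?)",
        (zlib.compress(b"From: alice@example.org"), zlib.compress(b"Hello list")),
    )
    return conn

rows = list(read_messages(demo_db()))
```

Against the real file, it is probably worth running `SELECT name, sql FROM sqlite_master` first to see the actual tables and columns, then adjusting the query to match.
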
>
> The entire workflow is described in the GitHub repo, but please don't
> spider your own copy of the data, or gmane might lose patience.  If folks
> are interested I will upload the raw data (content.sqlite for Sakai is
> 635 MB, 10X the size of the properly modeled, cleaned up and compressed
> data in index.sqlite).
>
> We can do a lot of cool analysis like text analysis or average reply speed
> or average number of replies - I will see what the students want to do as
> their projects once I whip up a few cool visualizations to get them started.
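
As a starting point for the reply-speed and reply-count ideas, here is a sketch that assumes each message has been reduced to a (message id, parent id or None, sent time in seconds) tuple. That tuple layout is an assumption made for this example and says nothing about how the real data is modeled.

```python
from collections import Counter
from statistics import mean

def reply_stats(messages):
    """Return (average reply delay in seconds, average replies per thread).

    messages: iterable of (msg_id, parent_id or None, sent_at_seconds).
    A message whose parent is not in the data is treated as a thread start.
    Assumes the parent links contain no cycles.
    """
    messages = list(messages)
    sent = {mid: t for mid, _, t in messages}
    parent_of = {mid: p for mid, p, _ in messages}

    def root(mid):
        # Walk up the parent links until we leave the known messages.
        while parent_of.get(mid) in sent:
            mid = parent_of[mid]
        return mid

    # Delay between each reply and the message it answers.
    delays = [t - sent[p] for _, p, t in messages if p in sent]

    # Number of replies (at any depth) hanging off each thread start.
    per_thread = Counter()
    for mid, _, _ in messages:
        r = root(mid)
        per_thread[r] += 0 if r == mid else 1

    avg_delay = mean(delays) if delays else None
    avg_replies = mean(list(per_thread.values())) if per_thread else None
    return avg_delay, avg_replies

# Invented sample: one thread with two replies, one unanswered post,
# and one reply whose parent is missing from the data.
sample = [(1, None, 0), (2, 1, 60), (3, 2, 180), (4, None, 1000), (5, 99, 2000)]
avg_delay, avg_replies = reply_stats(sample)
```

Treating orphaned replies as thread starts keeps them out of the delay average, at the cost of inflating the count of threads with zero replies.
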
>
> Let me know if you see an error, or if you or your school are improperly
> represented in the data.  I tried as best I could to map people to their
> most recent email address if they have had more than one email address over
> the life of the mailing list.
>
> /Chuck
>
> P.S. Here is some of the output of the program:
>
> Top 40 Email list organizations
> gmail.com 7339
> umich.edu 6243
> uct.ac.za 2451
> indiana.edu 2258
> unicon.net 2055
> tfd.co.uk 1591
> berkeley.edu 1384
> longsight.com 1347
> stanford.edu 1266
> ox.ac.uk 1193
> ucdavis.edu 1175
> rsmart.com 1063
> cam.ac.uk 1035
> etudes.org 866
> gatech.edu 857
> rutgers.edu 758
> columbia.edu 700
> virginia.edu 644
> earthlink.net 606
> mtu.edu 585
> mac.com 563
> ufp.pt 475
> rice.edu 442
> uva.nl 421
> yale.edu 407
> sakaifoundation.org 321
> csu.edu.au 284
> uhi.ac.uk 261
> yahoo.com 259
> upvnet.upv.es 250
> hotmail.com 249
> upmc.fr 248
> threecanoes.com 245
> unisa.ac.za 241
> serensoft.com 238
> ufp.edu.pt 237
> aeroplanesoftware.com 235
> unavarra.es 227
> ucmerced.edu 223
> loi.nl 220
>
> Top 40 Email list participants
> steve.swinsburg at gmail.com 2657
> azeckoski at unicon.net 1742
> ieb at tfd.co.uk 1591
> csev at umich.edu 1304
> david.horwitz at uct.ac.za 1184
> stephen.marquard at uct.ac.za 853
> arwhyte at umich.edu 782
> matthew at longsight.com 701
> adam.marshall at ox.ac.uk 699
> jimeng at umich.edu 698
> slt at columbia.edu 686
> clay.fenlason at gatech.edu 670
> adrian.r.fish at gmail.com 612
> markjnorton at earthlink.net 605
> chmaurer at indiana.edu 601
> swgithen at mtu.edu 585
> knoop at umich.edu 571
> hedrick at rutgers.edu 565
> ggolden22 at mac.com 560
> sinou at etudes.org 527
> ray at berkeley.edu 491
> bkirschn at umich.edu 489
> tpamsler at ucdavis.edu 485
> lance at indiana.edu 479
> botimer at umich.edu 461
> jholtzman at berkeley.edu 452
> jleasia at umich.edu 451
> zqian at umich.edu 435
> matthew.buckett at ox.ac.uk 434
> caseyd1 at stanford.edu 431
> nuno at ufp.pt 429
> ktsao at stanford.edu 429
> ajpoland at indiana.edu 405
> dlhaines at umich.edu 377
> mmmay at indiana.edu 356
> a.m.berg at uva.nl 337
> dave.ross at gmail.com 330
> ottenhoff at longsight.com 328
> john.bush at rsmart.com 325
> jpgorrono at ucdavis.edu 298
>



-- 
Regards,
       Alan

Alan Berg