LingMA update, 1-15-2010
Per my plan for January, today is the day I was due to have several EFR segmentation files ready for testing the project software. Here's what I've gathered:
Raw data files (sentence alignment files, XCES format)
XSLT
Segfiles
Tools
Reference
- Project prospectus (Last updated 27 Aug 2009)
- OPUS OpenSubtitles project
- English-French bitext data
The segmentation files aren't quite in their final form yet, due to some limitations of XSLT transformations. Later today, on the Code section of my site, I'll be posting the gory details of the differences and writing a JavaScript parser to bridge them.
Labels: school
2 Comments:
I totally don't get any of what was written in this post, but good for you! We want you to get done with this master's thing so you can come visit! :)
Heh.. yeah, unlike /code and /MUGEN, my root blog ends up being all things to all people, so more often than not it's nothing to anyone but me.
School has to get done this summer, or it won't get done at all. Deadlines have this great way of forcing you to focus on things. ;)