Migrating from 0.4 CVS?

Post by peterj on Tue Oct 26, 2004 5:00 am

I've been using the 0.4 CVS (with minor changes) and an HSql database. Is there any way to migrate this config to 0.5/Derby and preserve the article database?

FYI, the changes I made are:

    - I hate prefilling URL fields with "http://", so I removed that.
    - I've found an Atom feed that has
          <content type="text/html">
      instead of
          <content type="application/xhtml+xml">
      so I made AtomParser handle that condition.
    - I turn off caching by adding a
          Pragma: no-cache
      header.
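A minimal sketch of the content-type check described in the second item. The class and method names here are hypothetical and are not taken from AtomParser; only the two type prefixes come from the post.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not the actual AtomParser code.
final class AtomContentTypes {
    private AtomContentTypes() {}

    /** Returns true when an Atom content "type" attribute names HTML or XHTML markup. */
    static boolean isMarkup(String type) {
        if (type == null) {
            return false;
        }
        // Normalize so that " Text/HTML " is treated like "text/html".
        String normalized = type.trim().toLowerCase(Locale.ROOT);
        return normalized.startsWith("application/xhtml")
                || normalized.startsWith("text/html");
    }
}
```

With this helper, isMarkup("text/html") and isMarkup("application/xhtml+xml") both return true, while isMarkup("text/plain") returns false.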
peterj
 
Posts: 9
Joined: Tue Oct 21, 2003 5:11 pm

Post by jason on Tue Oct 26, 2004 1:38 pm

Peter,

First thing to try - just unzip v0.5 into a new directory and, before starting nntp//rss for the first time, copy the existing hsqldb datafiles (nntprssdb.*) from your existing nntp//rss install into that directory.

When nntp//rss starts up, it should discover the existing hsqldb database, and start migrating the article database into Derby. Note that if you have a lot of articles this process could take some time. If you're running on Windows, I recommend running nntp//rss from the command shell (java -jar nntprss-start.jar) for the first time. The migration code logs progress information during the process.
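If you would rather script the copy step than do it by hand, something along these lines should work. It is only a sketch: the class name, method name and directory arguments are made up for illustration, and the only detail taken from this thread is the nntprssdb.* file pattern. Run it while nntp//rss is stopped.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical utility; not part of nntp//rss.
final class HsqldbFileCopy {
    private HsqldbFileCopy() {}

    /** Copies every nntprssdb.* file from oldDir into newDir and returns the copied names, sorted. */
    static List<String> copyDataFiles(Path oldDir, Path newDir) throws IOException {
        List<String> copied = new ArrayList<>();
        // The glob matches nntprssdb.script, nntprssdb.data, etc., but not nntprss.db.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(oldDir, "nntprssdb.*")) {
            for (Path source : stream) {
                if (!Files.isRegularFile(source)) {
                    continue; // skip directories that happen to match the pattern
                }
                Path target = newDir.resolve(source.getFileName());
                Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                copied.add(source.getFileName().toString());
            }
        }
        Collections.sort(copied);
        return copied;
    }
}
```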

If you could email me the diffs for the changes you made, I'll look to get those wrapped into the next version.

Jason
jason
Site Admin
 
Posts: 114
Joined: Sat May 03, 2003 10:44 pm
Location: West Orange, NJ

Post by peterj on Tue Oct 26, 2004 4:09 pm

Copying only the nntprssdb.* files I get an EOFException (which is then propagated up) while reading nntprssdb.script:

Code:
11:51:10,228 [main] INFO  Main - Starting nntp//rss v0.5-beta-1
11:51:10,975 [main] INFO  ChannelDAO - Initializing JDBC, connection string = jdbc:derby:nntprssdb;create=true
11:51:26,909 [main] INFO  DerbyChannelDAO - Creating application database tables
11:51:28,167 [main] INFO  DerbyChannelDAO - Finished creating application database tables
java.io.EOFException
        at org.hsqldb.DatabaseFile.readInteger(Unknown Source)
        at org.hsqldb.Cache.makeRow(Unknown Source)
        at org.hsqldb.Cache.getRow(Unknown Source)
        at org.hsqldb.Table.setIndexRoots(Unknown Source)
        at org.hsqldb.Table.setIndexRoots(Unknown Source)
        at org.hsqldb.Database.processSet(Unknown Source)
        at org.hsqldb.Database.execute(Unknown Source)
        at org.hsqldb.Log.runScript(Unknown Source)
        at org.hsqldb.Log.open(Unknown Source)
        at org.hsqldb.Database$Logger.openLog(Unknown Source)
        at org.hsqldb.Database.open(Unknown Source)
        at org.hsqldb.Database.<init>(Unknown Source)
        at org.hsqldb.jdbcConnection.openStandalone(Unknown Source)
        at org.hsqldb.jdbcConnection.<init>(Unknown Source)
        at org.hsqldb.jdbcDriver.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:525)
        at java.sql.DriverManager.getConnection(DriverManager.java:171)
        at org.methodize.nntprss.feed.db.ChannelDAO.migrateHsql(ChannelDAO.java:195)
        at org.methodize.nntprss.feed.db.DerbyChannelDAO.populateInitialChannels(DerbyChannelDAO.java:214)
        at org.methodize.nntprss.feed.db.JdbcChannelDAO.initialize(JdbcChannelDAO.java:506)
        at org.methodize.nntprss.db.DBManager.configure(DBManager.java:56)
        at org.methodize.nntprss.Main.startNntpRss(Main.java:132)
        at org.methodize.nntprss.Main.main(Main.java:201)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.methodize.nntprss.Startup.run(Startup.java:116)
        at org.methodize.nntprss.Startup.main(Startup.java:74)
11:51:31,585 [main] ERROR ChannelDAO - Exception thrown when trying to migrate hsqldb
java.sql.SQLException: File input/output error: File input/output error: reading: java.io.EOFException in statement [SET TABLE CHANNELS INDEX '12344464 118744']
        at org.hsqldb.Trace.getError(Unknown Source)
...snip...
11:51:31,660 [main] ERROR Main - Exception thrown during startup
java.lang.RuntimeException: Exception throws whent rying to migrate hsqldb File input/output error: File input/output error: reading: java.io.EOFException in statement [SET TABLE CHANNELS INDEX '12344464 118744']
        at org.methodize.nntprss.feed.db.ChannelDAO.migrateHsql(ChannelDAO.java:405)
...snip...
java.lang.RuntimeException: Exception throws whent rying to migrate hsqldb File input/output error: File input/output error: reading: java.io.EOFException in statement [SET TABLE CHANNELS INDEX '12344464 118744']
        at org.methodize.nntprss.feed.db.ChannelDAO.migrateHsql(ChannelDAO.java:405)

Post by peterj on Tue Oct 26, 2004 4:13 pm

And the diffs...

Code:
Index: src/org/methodize/nntprss/admin/AdminServlet.java
===================================================================
RCS file: /cvsroot/nntprss/nntprss/src/org/methodize/nntprss/admin/AdminServlet.java,v
retrieving revision 1.14
diff -r1.14 AdminServlet.java
2215c2215
<                 "<tr><td class='row1' align='right'>Feed URL:</td><td class='row2' ><input type='text' name='url' size='64' value='http://'><br><i>(nntp//rss supports both RSS and ATOM feeds)</i></td></tr>");
---
>                 "<tr><td class='row1' align='right'>Feed URL:</td><td class='row2' ><input type='text' name='url' size='64'><br><i>(nntp//rss supports both RSS and ATOM feeds)</i></td></tr>");
Index: src/org/methodize/nntprss/feed/Channel.java
===================================================================
RCS file: /cvsroot/nntprss/nntprss/src/org/methodize/nntprss/feed/Channel.java,v
retrieving revision 1.12
diff -r1.12 Channel.java
249a250
>                     method.setRequestHeader("Pragma", "no-cache");
Index: src/org/methodize/nntprss/feed/parser/AtomParser.java
===================================================================
RCS file: /cvsroot/nntprss/nntprss/src/org/methodize/nntprss/feed/parser/AtomParser.java,v
retrieving revision 1.7
diff -r1.7 AtomParser.java
368c368
<             if (type.startsWith("application/xhtml")
---
>             if ((type.startsWith("application/xhtml")||type.startsWith("text/html"))
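
For anyone who wants to try the no-cache change outside of nntp//rss, here is a rough stand-alone equivalent of the Channel.java line. It swaps in the JDK's java.net.HttpURLConnection rather than whatever HTTP client object `method` refers to in Channel.java, and the class name and URL are placeholders.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical stand-alone example; nntp//rss itself sets the header in Channel.java.
final class NoCacheRequest {
    private NoCacheRequest() {}

    /** Prepares (but does not send) a GET request that asks caches not to serve a stored copy. */
    static HttpURLConnection open(String feedUrl) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(feedUrl).toURL().openConnection();
        connection.setRequestMethod("GET");
        // Same header as in the diff above.
        connection.setRequestProperty("Pragma", "no-cache");
        return connection;
    }
}
```

Nothing is sent over the network until the caller reads from the connection, for example with getResponseCode().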

Post by jason on Tue Oct 26, 2004 6:05 pm

peterj wrote: Copying only the nntprssdb.* files I get an EOFException (which is then propagated up) while reading nntprssdb.script:


Are you still able to use the same data files with nntp//rss v0.3? If so, I wonder whether this could be a memory issue. By default, the Java VM starts with a maximum memory size of 128M. The exception is being thrown by hsqldb when it is initializing the database. Is it possible that the script file may have been truncated in some way?

Try starting a fresh copy of nntp//rss v0.5 (with the hsqldb files copied over) with the following command:

java -Xmx256m -jar nntprss-start.jar
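
To confirm that the -Xmx setting is actually being picked up, a tiny class like the following can be run with the same flag (for example, java -Xmx256m HeapCheck). The class name is made up; Runtime.maxMemory() is standard, though the figure it reports can be slightly below the -Xmx value.

```java
// Hypothetical diagnostic; not part of nntp//rss.
final class HeapCheck {
    private HeapCheck() {}

    /** Returns the maximum heap the JVM will attempt to use, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```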

... and thanks for the diffs.

Post by peterj on Tue Oct 26, 2004 8:21 pm

I get the same error trying to use the db under 0.3, but all's well in 0.4; ditto for increasing the memory (to 256m, 512m and even 1024m). I'm using the JdbmChannelDAO class in nntprss-config.xml rather than HsqlDbChannelDAO. (My 0.4 version was built from CVS on July 13, in case that makes any difference.)

BTW, I have nntprssdb.data, nntprssdb.properties and nntprssdb.script, but the actual hsqldb database is called nntprss.db:

Code:
> ls -la nntprssdb.* nntprss.db
-rw-r--r--    1 pjanes   pjanes   20471808 Oct 26 15:41 nntprss.db
-rw-rw-r--    1 pjanes   pjanes          0 Oct 26 15:43 nntprssdb.data
-rw-rw-r--    1 pjanes   pjanes        343 Oct 26 15:43 nntprssdb.properties
-rw-r--r--    1 pjanes   pjanes       4747 Oct 26 15:41 nntprssdb.script

Post by jason on Tue Oct 26, 2004 9:13 pm

Ah - ok. You're using jdbm as the persistence store. I'm working on implementing jdbm -> Derby migration support for those users of the 0.4 beta. I should have something available within the next day or two.

The error was caused by nntp//rss trying to migrate the old hsqldb files (nntprssdb.*), which it seems are corrupt. The jdbm file (nntprss.db) is the one that will be migrated when I've finished the aforementioned module.
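
For anyone else unsure which store their install has been writing to, a rough check along these lines may help. It is a heuristic sketch only: the class and enum names are invented, and it simply looks for a non-empty nntprss.db (jdbm) before a non-empty nntprssdb.script (hsqldb). The ChannelDAO class named in nntprss-config.xml remains the authoritative answer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical heuristic; file names are the ones mentioned in this thread.
final class StoreSniffer {
    enum Store { JDBM, HSQLDB, NONE }

    private StoreSniffer() {}

    /** Guesses the persistence store from the data files present in an install directory. */
    static Store guess(Path installDir) throws IOException {
        if (isNonEmptyFile(installDir.resolve("nntprss.db"))) {
            return Store.JDBM;
        }
        if (isNonEmptyFile(installDir.resolve("nntprssdb.script"))) {
            return Store.HSQLDB;
        }
        return Store.NONE;
    }

    private static boolean isNonEmptyFile(Path file) throws IOException {
        return Files.isRegularFile(file) && Files.size(file) > 0;
    }
}
```

A directory holding both a non-empty nntprss.db and leftover nntprssdb.* files is reported as JDBM.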

Hang in there - I should be able to get you going soon.

Post by peterj on Tue Oct 26, 2004 9:53 pm

D'oh! I was reading "jdbm" as "jdbc"... thanks for pointing out my mistake.

Post by jason on Thu Oct 28, 2004 2:26 am

Here's a first attempt at the jdbm to Derby migration logic. This has been successfully tested with a few nntp//rss jdbm databases, so you may want to give it a try in your environment.

1. Unzip a copy of the standard v0.5-beta-1 release into a directory.
2. Download the updated nntprss.jar from http://www.methodize.org/nntprss/patch/ ... ntprss.jar and copy it over the existing one in the directory used in step 1.
3. Copy your jdbm data file (nntprss.db) from your nntp//rss v0.4 directory into the new v0.5 directory.
4. Start up nntp//rss - if you're on Windows, I recommend doing it the first time from within the command shell (java -jar nntprss-start.jar), as the migration process will log progress information to the console.

Once the migration process is complete, nntp//rss will start up as normal.

The only limitation of this migration tool is that it will not migrate categories. It will, however, migrate your channel configuration and the article database. Note that if you have a large database and a slower computer this process may take some time.

Post by peterj on Thu Oct 28, 2004 3:05 am

The migration looks to be okay, sans categories as you noted. I've got some subscriptions that have more than 1000 posts stored... impressive speed migrating them.

Interestingly, even after restarting post-migration the Java process is about 20MB larger than the 0.4 process, 272MB vs. 252MB. (I'm running in Linux, btw.)

I have found that I get "can't declare any more prefixes in this context" from about 1/3 of my existing feeds. I believe I've seen that this is a problem with Java 1.5 (specifically the version of crimson that's included with the JDK) although it doesn't happen with 0.4-cvs. Will investigate and see what I can come up with.

Post by jason on Thu Oct 28, 2004 11:30 am

It seems like there is a slight memory overhead with using Derby over jdbm - this is really the trade-off of using a JDBC-compliant relational datastore over a lower-level object persistence engine. I decided to go with Derby as I thought it would offer greater flexibility in the future, as well as making development easier with regard to supporting both Derby and MySQL (the core codebase is the same, with some minor differences).

That said, I will take a look and see if there is any way in which I can optimize the memory profile.

Regarding the 'can't declare any more prefixes in this context' - I did a quick google on that topic, and it does seem like it is a potential issue with the SAX parser within 0.5 related to XML documents with a number of namespace declarations. I did come across this document in the Apache bugzilla - http://issues.apache.org/bugzilla/show_bug.cgi?id=30258. I'm wondering if you could try using the most up-to-date Xerces JARs to override those in the 1.5 release. Anyway, if you have any findings, I'd be interested to hear.
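
To see which SAX parser a given JVM actually picks up, and whether it copes with a document that declares many namespace prefixes, a small probe like this may be useful. It is only a sketch and is not guaranteed to reproduce the Bugzilla issue: the class name, the generated document and the prefix count passed in are all arbitrary choices for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical probe; not part of nntp//rss.
final class NamespaceProbe {
    private NamespaceProbe() {}

    /** Returns the class name of the SAX parser implementation the JVM selects. */
    static String parserClassName() throws Exception {
        return SAXParserFactory.newInstance().newSAXParser().getClass().getName();
    }

    /** Builds a small document whose root declares `count` prefixes (count must be at least 1). */
    static String documentWithPrefixes(int count) {
        StringBuilder xml = new StringBuilder("<root");
        for (int i = 0; i < count; i++) {
            xml.append(" xmlns:p").append(i)
               .append("=\"http://example.invalid/ns/").append(i).append('"');
        }
        return xml.append("><p0:item/></root>").toString();
    }

    /** Parses the document with a namespace-aware SAX parser and counts its elements. */
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final int[] elements = {0};
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                elements[0]++;
            }
        });
        return elements[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SAX parser: " + parserClassName());
        System.out.println("Elements parsed: " + countElements(documentWithPrefixes(40)));
    }
}
```

If the parser has trouble with prefix declarations, countElements would be expected to throw a SAXException instead of returning.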

Post by peterj on Thu Oct 28, 2004 1:00 pm

Sorry, tried to respond last night but phpBB was reporting an error... the "can't declare any more prefixes" error is gone. I hadn't upgraded my home machine to the release of 1.5.0; it still had beta 2.

I thought I saw a message go by on the mailing list that Derby had a lower memory overhead, which is the only reason I noted the size. Must have misread.

Other than that, things look good. I'll hold on to my 0.4 database for a while to test further betas, should there be any; quite pleased with this one though!

Post by jason on Fri Oct 29, 2004 12:38 am

My hosting provider does have occasional (or, recently, more than occasional) outages - sorry about that.

Glad to hear the release version of Java 1.5.0 solved the problems. Regarding the memory usage of Derby - you did see a message; however, I was comparing its behavior to hsqldb, as used in v0.3 and below. jdbm (v0.4) had a slightly leaner profile, but didn't expose relational database semantics.

The neat thing about Derby is that it would make it very easy to support DB2. Though, for now, I think Derby and MySQL offer two good alternatives for desktop and workgroup/enterprise deployments.

