Friday, December 21, 2012

Making life easy at mysql command line

I was annoyed with these same problems especially paging query results  at mysql command line or grepping within results or recording queries for future purpose. For all those debugging productions issues related to mysql  here is an excellent blog from perconna to make your life easy 

Wednesday, December 19, 2012

CGlib enhancer and finalize method

Yesterday was a bad day because one of the node after updating to tomcat7 ran into full garbage collection. I took a heap dump and finally found that one of the enhanced class had 300K references hanging around in finalizer thread. I was  enhancing SimpelJDBCTemplate to find out time taken by each query and log it. The reason this happened because CGLib also enhanced the finalize method and in the method interceptor I was  delegating the call to the delegate object which was not existing at that time. Anyways the solution was to skip the enhancing of finalize by adding a callback filter and nooping on it.

As you can see in below code the filter returns 0 if finalize method is called and that means use the callback at 0th position in the callbacks which is a NoOp callback and for all others it uses 1st position which is my real code. 

The simple code fix was

    public static SimpleJdbcTemplate createPerfInterceptedTemplate(Class callerClass, DataSource dataSource) {
        final SimpleJdbcTemplate simpleJdbcTemplate = new SimpleJdbcTemplate(dataSource);
        final String className = callerClass.getSimpleName();
        Enhancer enhancer = new Enhancer();
        enhancer.setCallbacks(new Callback[]{NoOp.INSTANCE, new MethodInterceptor() {
            public Object intercept(Object obj, Method method, Object[] args, MethodProxy proxy) throws Throwable {
       your real delegate code here..........               

private static final CallbackFilter FINALIZE_FILTER = new CallbackFilter() {
        public int accept(Method method) {
            if (method.getName().equals("finalize") &&
                method.getParameterTypes().length == 0 &&
                method.getReturnType() == Void.TYPE) {
                return 0;
            return 1;

Tomcat7 session fixation and session listener

Ran into an interesting issue. We use flash uploader in our web ui for browsers that dont support html5 and flash has a problem that each request it makes to server has a new session because it wont send any cookies back to server. The only way to circumvent around this is to send original sessionId as a post parameter and on server cache all the sessions in tomcat memory and then join this new session to the original session using sessionid coming in post.

Anyways long story short we updated to tomcat7 and suddenly one of our feature that allows us to impersonate a user broke.  Finally nailed it to  a security fix in tomcat7 that will renew sessionId on basic authentication but the issue is that to do flash based file upload we relied on HttpSessionListener.sessionCreated to cache all sessions by sessionId.  And when  tomcat7 was renewing sessionId  it was not calling  the sessionCreated event for the new session. There were two ways to solve it:

1) Disable session fixation security fix by adding  the below valve to context.xml . I did not chose to do this as it would make tomcat less secure.

<Valve className="org.apache.catalina.authenticator.BasicAuthenticator"  changeSessionIdOnAuthentication="false"/>

2) Extract the code in HttpSessionListener.sessionCreated to a common method and call it manually during impersonation.  I chose this way for now as its more secure.  In future when tomcat fixes this bug I will remove this code.

Thursday, December 13, 2012

Biggest relief from NOSQL to SQL migration

This year my biggest accomplishment was to move our old NOSQL system from BDB/Cassandra to Mysql and so far its holding billions of rows and working fine.  Though this move has given me and my buddy peace and good sleep and I can now focus on other fires. But the biggest relief comes from being able to delegate some tasks to junior team and also being able to quickly script adhoc requirements quickly.

For e.g. today I got an adhoc requirement to find list of customers with > 1000 versions of a single file.  Had it been BDB I would have to write a program and then run it on each app node to find the answer and it would have taken days to get this info.  But with mysql all I had to do was to write a script that will execute a federated query and get me the output so all I need to do is run something like

nohup python "select pid, max(cnt) from (select customerid,file_id,count(version_id) cnt from \${SCHEMA_NAME}.version_\${TBL_SUFFIX} group by customer_id,file_id having count(version_id) >1000)a group by customerid" &

and within 6 hours the script had sweepeed all the shards and got me the result. So I had to just do some grep/awk/sed to get the answer.

Few weeks back I got some adhoc requirement to find max size of file uploaded by each customer and same I got it in less than one day whereas earlier I would have spent days getting this info.

Also one more advantage I see is that I have created a read only user on each mysql slave and  given access to production support people and as everyone knows relational database very few requirements come to me so I get more free time to work on more interesting things.

Also last night mysql server crashed twice while creating index on a 85M record table and boy the DBAs knew how to recover it whereas when it was BDB it was an Achilles hill to recover data out of it and this time Ops team did it without involving everyone and they were all confident and relaxed and BDB/Cassandra was always panic mode.

Got first mysql table with 84M records

wow this is the first time I had scaled a database with 84M records in one table. Though I didn't expected it to grow this big but a weird customer behaviour where he had 17K versions of a file and he moved it 5K times. 17K version * 5K move=85M events generated in one shard in one table.

Within a month these will get purged as we retain last one month event, so table size will go down but it feels good that the system is behaving nice after adding the missing index on this 85M table, today its again sleeping like a baby.

A copy paste mistake can bring down a server

we recently migrated our eventstore on Mysql and I did a small boo boo.  I had two tables event and event_details and I had created two indexes on it

CREATE INDEX events_${TBL_SUFFIX}_i2 ON events_${TBL_SUFFIX} (event_detail_event_guid, pid);

CREATE INDEX event_details_${TBL_SUFFIX}_i1 ON events_${TBL_SUFFIX} (event_detail_event_guid, pid);

As you can notice the boo boo in second index Instead of creating it on event_details table I created the same index on event table :(.

Yesterday night a shard event_detail table balloned to 85M records and there were 20 threads doing a full table scan on this table. 

so I fixed the copy paste mistake and generated ddls for each shard but the mysql server kept going Out of memory everytime it tried creating index on this 85M record table.  Ultimately the only way to get it done was to start mysql on diff port so no one will connect  to it and then give innodb more memory and create index, reset innodb settings back and then restart the server.

All in all it took 6 hours in night to fix the after effects of this copy paste mistake.