Monday, August 29, 2011

Java and lsof

Update: Using Runtime.exec turned out to be a bad idea: with a 2GB VM footprint, the forked process needs 2GB of free memory just to run the lsof command. We had earlier written a simple Python HTTP RPC server that lets us execute native commands (like creating a hard link or running gunzip) from Java, so a few days back I changed this code to delegate to an RPC call.
So the new code looks like this:



 public void writeTopCommandOutput(Writer writer) throws IOException {
  String rpcRes = Util.doCommandRpc(rpcUrl, "", ListUtil.create("top", "-n", "2", "-b", "-d", "0.2"));
  writer.write(rpcRes);
 }

 public void writeLsofOutput(Writer writer) throws IOException {
  String pid = getJvmProcessid();
  if (pid != null) {
   pid = pid.trim();
   String rpcRes = Util.doCommandRpc(rpcUrl, "", ListUtil.create("lsof", "-p", pid));
   writer.write(rpcRes);
  }
 }


 --------------------------------------
I was chasing a customer issue, and some threads in the thread dump showed that they were stuck doing filer I/O. I needed to find out which file paths these threads were accessing, because this customer had experienced the same issue a few days earlier. So I asked operations to give me the lsof output. We are a distributed team and engineers don't have access to production machines, so it always takes time to chase people; to a programmer this means lots of context switches, and it derails your thought process. My goal is to eliminate as many hoops in my debugging path as possible, so I wrote a JSP to get the lsof output from Java. The JSP will be accessible through internal IPs only. Hurray: from the next release onwards, one more reason not to depend on the Operations team when chasing issues.
Here is the method I added in the JSP:


    public void writeLsofOutput(Writer writer) throws IOException {
        String pid = getJvmProcessid();
        if(pid!=null){
            pid = pid.trim();
            String cmdLsof = "lsof -p " + pid;
            executeCmd(writer, cmdLsof);            
        }
    }

    private void executeCmd(Writer writer, String cmd) throws IOException {
        String output;
        Process child = Runtime.getRuntime().exec(cmd, new String []{});            
        BufferedReader stdInput = new BufferedReader(new InputStreamReader(child.getInputStream()));
        while ((output = stdInput.readLine()) != null) {
            writer.write(output + "\n");
            writer.flush();
        }
        stdInput.close();
    }
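    public String getJvmProcessid() throws IOException {
        String pid = null;
        // CATALINA_PID may be unset; new File(null) would throw a NullPointerException
        String pidPath = System.getenv("CATALINA_PID");
        if (pidPath == null) {
            return null;
        }
        File pidFile = new File(pidPath);
        if(pidFile.exists()) {
            FileInputStream fin = new FileInputStream(pidFile);
            try {
                List lines = IOUtils.readLines(fin);
                pid = StringUtils.join(lines.toArray());
            } finally {
                // close the stream even if reading fails
                fin.close();
            }
        }
        return pid;
    } 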

Saturday, August 13, 2011

Programatically extracting quoted reply from an email

When files are uploaded to our cloud file server, we wanted to send a notification email per file, each with its own unique email address. I will discuss in a later post how to have so many unique email addresses without creating a user on the mail server for each file, and how to scale out the solution. People can just hit the reply button on the generated notification email and comment on the uploaded file. When the reply reaches the server, we want to extract the comment the user added, strip out the quoted reply inserted by the mail client, and add the clean comment to the file. Seems like an easy problem, doesn't it? Unfortunately there is no easy way to detect the quoted reply in an incoming email, because different mail clients quote a reply in different ways. On top of that, quoted replies in HTML emails differ from quoted replies in plain text:
  1. Angle Brackets "> xxx zzz"
  2. "---Original Message---"
  3. "On such-and-such day, so-and-so wrote:"
  4. html email reply in thunderbird uses blockquote tags.
  5. yahoo/hotmail uses some div tags
Someone gave me the brilliant idea of adding a marker line to the outbound notification email, so that when the reply comes back we can strip the text after that marker. Then I found that other sites, such as Redmine and IssueBurner, already do this. They add a marker text to the outbound email like the one below:

##### Please do not write below this line #####
Hi kalpesh,

The issue has been updated.

Updated by:     Kris Katta
Comment added:     this is a test comment
Kris Katta's Reply..

This is a test reply

To track the status of your request and set up a profile for yourself, follow the link below:





Now all that is left is to strip the reply header added by the mail client (for example "On such-and-such day, so-and-so wrote:"), which a few regexes can handle. I have handled Thunderbird and Outlook and will soon add Yahoo/Hotmail. Below is some sample code.


import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * @author kpatel
 */
public class QuotedReplyExtractor {
    public static final String REPLY_MARKER =
            "--- Please reply ABOVE THIS LINE to comment on this file ---";

    // Reply headers inserted by mail clients; everything from a match onwards is dropped.
    private static final List<Pattern> patterns = new ArrayList<Pattern>();
    static {
        patterns.add(Pattern.compile(".*on.*?wrote:", Pattern.CASE_INSENSITIVE));
        patterns.add(Pattern.compile("-+original\\s+message-+\\s*",
                Pattern.CASE_INSENSITIVE));
    }

    public String stripQuotedReply(String comment) {
        // indexOf returns -1 when the marker is absent and 0 when it starts the text
        int startIndex = comment.indexOf(REPLY_MARKER);
        if (startIndex >= 0) {
            comment = comment.substring(0, startIndex);
        }
        for (Pattern pattern : patterns) {
            Matcher matcher = pattern.matcher(comment);
            if (matcher.find()) {
                startIndex = matcher.start();
                comment = comment.substring(0, startIndex);
            }
        }
        return comment;
    }

} 
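
The sample above covers the marker line and the two reply-header styles, but not the angle-bracket style from the list. A minimal sketch of how that case could be handled, assuming plain-text mail where every quoted line starts with ">". The class and method names here are hypothetical and not part of the original extractor:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of QuotedReplyExtractor.
class AngleQuoteStripper {

    /** Drops every line that starts with '>' after optional leading whitespace. */
    static String stripAngleQuotedLines(String body) {
        List<String> kept = new ArrayList<String>();
        for (String line : body.split("\r?\n", -1)) {
            if (!line.trim().startsWith(">")) {
                kept.add(line);
            }
        }
        // trim() removes the blank lines left behind where the quote used to be
        return String.join("\n", kept).trim();
    }

    public static void main(String[] args) {
        String reply = "Looks good to me.\n\n> The issue has been updated.\n> Comment added: test";
        System.out.println(stripAngleQuotedLines(reply));
    }
}
```

A user who deliberately starts a line of the comment with ">" would lose that line, so this is best applied only after the marker and header checks have found nothing.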

Wednesday, August 10, 2011

Spring MVC and Unicode characters

We ran into an issue where a user entered the Danish character ø in his first name and it was not updated properly in LDAP. At first we thought it was an LDAP issue, but then I found that on another page, where we use the DWR API, the same character was updated in LDAP properly. Finally we nailed it down to Tomcat/Spring, where even after doing
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
the character encoding was not properly set on the request.
Adding this filter solved the issue. It has to be the first filter in web.xml,
otherwise it won't work. I had it as the second filter earlier and wasted some time debugging it.
	 <filter>
		<!-- Filter to handle content encoding for UTF-8 encoding, this has to be the FIRST FILTER, do not move -->
	    <filter-name>encodingFilter</filter-name>
<filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
	    <init-param>
	        <param-name>encoding</param-name>
	        <param-value>UTF-8</param-value>
	    </init-param>
	    <init-param>
	        <param-name>forceEncoding</param-name>
	        <param-value>true</param-value>
	    </init-param>
	 </filter>
	 <filter-mapping>
	    <filter-name>encodingFilter</filter-name>
	    <url-pattern>/*</url-pattern>
	 </filter-mapping> 
 
 
Technically, all it is doing is
request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8"); 
 
Ideally this should be set by Tomcat, as I had the setting below,
but somehow it was not working, so adding this filter made it work. (URIEncoding covers the request URI and query string rather than the request body, which may be why it did not help here.)
DWR was parsing the request body manually, and that is why it worked and Spring did not.
<Connector port="8080" URIEncoding="UTF-8" ......>
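
To see why the request encoding matters, here is a small standalone sketch of what happens when UTF-8 bytes are decoded with the wrong charset. It assumes the container falls back to ISO-8859-1 when no request encoding is set, which is the servlet default. The class and method names are made up for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only uses the JDK, no servlet API.
class EncodingMismatchDemo {

    /** Encodes text as UTF-8 (what the browser sends) and decodes it with the given charset. */
    static String decodeAs(String text, String charsetName) {
        byte[] wireBytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(wireBytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String name = "\u00F8"; // the Danish character from the bug report
        // Wrong charset: the two UTF-8 bytes become two separate characters
        System.out.println(decodeAs(name, "ISO-8859-1").length());
        // Right charset: the original single character comes back
        System.out.println(decodeAs(name, "UTF-8").equals(name));
    }
}
```

Calling request.setCharacterEncoding("UTF-8") before any parameter is read is what switches the container from the first case to the second, which is why the filter has to run first.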

mysql execute immediate

I had a requirement to generate a sharded schema where the number of shards and the number of tables per shard were dynamic. Basically we wanted to shard one table, so we ended up creating 8 schemas, each holding 8 tables with the same structure. I didn't want to hand-write the 64 table/schema creation statements, so I came up with this procedure, which dynamically builds and executes SQL queries. Oracle made this easy; MySQL is a little more verbose.


    drop procedure if exists create_rdb_tables;
    delimiter #
    create procedure create_rdb_tables()
    begin
    declare v_max int unsigned default 9;
    declare v_counteri int unsigned default 1;
    declare v_counterj int unsigned default 1;
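      while v_counteri < v_max do
        -- reset the inner counter for every schema, otherwise only schema 1 gets tables
        set v_counterj = 1;
        while v_counterj < v_max do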
            set @sql_text := concat('drop table if exists metadata_rdb_schema',v_counteri, '.metadata_rdb_t', v_counterj,';');
            prepare stmt from @sql_text;
            execute stmt;
            DEALLOCATE PREPARE stmt;
    
            set @sql_text := concat('CREATE TABLE metadata_rdb_schema',v_counteri, '.metadata_rdb_t', v_counterj, ' (user_id INT NOT NULL, object_id VARCHAR(255) NOT NULL,group_id VARCHAR(36) NOT NULL,entry_id VARCHAR(36) NOT NULL,created_time DATETIME NOT NULL, primary key(object_id))ENGINE=InnoDB;');
            prepare stmt from @sql_text;
            execute stmt;
            DEALLOCATE PREPARE stmt;
            set v_counterj=v_counterj+1;
        end while;
       set v_counteri=v_counteri+1;
      end while;
    end #
    delimiter ;
    call create_rdb_tables();
    drop procedure if exists create_rdb_tables; 



Finally it started becoming messy and kludgy, so I ended up writing clean Python code to generate the DDL :). The effort was a waste, but I did learn how to do dynamic SQL execution in MySQL.
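
For comparison, generating the statements outside the database takes only a few lines. This is a rough Java sketch rather than the Python script mentioned above; the class name is made up, and the schema, table, and column names are copied from the procedure:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator; prints DDL to stdout so it can be piped into the mysql client.
class ShardDdlGenerator {

    private static final String COLUMNS =
            "(user_id INT NOT NULL, object_id VARCHAR(255) NOT NULL, "
            + "group_id VARCHAR(36) NOT NULL, entry_id VARCHAR(36) NOT NULL, "
            + "created_time DATETIME NOT NULL, PRIMARY KEY (object_id))";

    /** Returns one DROP and one CREATE statement per table, schema by schema. */
    static List<String> generate(int schemas, int tablesPerSchema) {
        List<String> ddl = new ArrayList<String>();
        for (int i = 1; i <= schemas; i++) {
            for (int j = 1; j <= tablesPerSchema; j++) {
                String table = "metadata_rdb_schema" + i + ".metadata_rdb_t" + j;
                ddl.add("DROP TABLE IF EXISTS " + table + ";");
                ddl.add("CREATE TABLE " + table + " " + COLUMNS + " ENGINE=InnoDB;");
            }
        }
        return ddl;
    }

    public static void main(String[] args) {
        for (String statement : generate(8, 8)) {
            System.out.println(statement);
        }
    }
}
```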

Tuesday, August 9, 2011

ProcessId 911



Call it good luck: I got process ID 911 for Java :). I am hiding my laptop name to keep it anonymous.

Tuesday, August 2, 2011

Tomcat configurable session timeout per customer

We are a cloud file provider geared towards enterprise customers. We have a default session timeout of 6 hours for web UI access, and recently customers asked to be able to configure the session timeout themselves. As we host multiple customers on one node, this was an interesting requirement, and we were discussing all sorts of hacks until I landed on the API HttpSession.setMaxInactiveInterval.

So now all we need to do is, upon successful login, check whether the admin has overridden the session timeout setting for this enterprise and set that value on the session using the above API.
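
A minimal sketch of that login hook follows. To keep it runnable without a servlet container, the session setter is passed in as an IntConsumer; in a real servlet you would pass session::setMaxInactiveInterval, which takes the timeout in seconds. The class name, the enterprise IDs, and the in-memory override map are assumptions for illustration, since the post does not say where the admin setting is stored:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// Hypothetical policy class; only the 6-hour default comes from the post.
class SessionTimeoutPolicy {

    static final int DEFAULT_TIMEOUT_SECONDS = 6 * 60 * 60;

    // enterprise id -> timeout in seconds chosen by that enterprise's admin
    private final Map<String, Integer> overrides = new HashMap<String, Integer>();

    void setOverride(String enterpriseId, int seconds) {
        overrides.put(enterpriseId, seconds);
    }

    /** Returns the admin override if present, otherwise the default. */
    int timeoutFor(String enterpriseId) {
        Integer override = overrides.get(enterpriseId);
        return override != null ? override : DEFAULT_TIMEOUT_SECONDS;
    }

    /** Call after a successful login, e.g. applyOnLogin(id, session::setMaxInactiveInterval). */
    void applyOnLogin(String enterpriseId, IntConsumer setMaxInactiveInterval) {
        setMaxInactiveInterval.accept(timeoutFor(enterpriseId));
    }
}
```

Because the timeout is stored on each session, customers sharing one node can have different values without touching the global session-timeout in web.xml.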

Monday, August 1, 2011

Tomcat printing catalina pid

If you are hosting more than one Tomcat on a physical box in production, you will often want to see the process ID of a running instance. We dump jstack and top output to a folder every 5 minutes, and the PID helps in correlating them. Here is some sample code to get the Tomcat PID.

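 public String getJvmProcessid() throws IOException {
  String pid = null;
  // CATALINA_PID may be unset; new File(null) would throw a NullPointerException
  String pidPath = System.getenv("CATALINA_PID");
  if (pidPath == null) {
   return null;
  }
  File pidFile = new File(pidPath);
  if(pidFile.exists()) {
   FileInputStream fin = new FileInputStream(pidFile);
   try {
    List lines = IOUtils.readLines(fin);
    pid = StringUtils.join(lines.toArray());
   } finally {
    // close the stream even if reading fails
    fin.close();
   }
  }
  return pid;
 }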
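
When CATALINA_PID is not set, a possible fallback is the runtime MXBean name, which on common JVMs has the form pid@hostname. That format is an implementation detail and not guaranteed, so the sketch below returns -1 when it cannot parse a PID. The class and method names are made up for illustration:

```java
import java.lang.management.ManagementFactory;

// Hypothetical fallback helper; relies on the conventional "pid@hostname" runtime name.
class JvmPid {

    /** Extracts the numeric part before '@', or returns -1 if the name has another shape. */
    static long parsePid(String runtimeName) {
        int at = runtimeName.indexOf('@');
        if (at <= 0) {
            return -1L;
        }
        try {
            return Long.parseLong(runtimeName.substring(0, at));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    static long currentPid() {
        return parsePid(ManagementFactory.getRuntimeMXBean().getName());
    }

    public static void main(String[] args) {
        System.out.println("JVM pid: " + currentPid());
    }
}
```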