Tuesday, October 16, 2007

Are Viruses and Botnets the Same? Not Really

Have you heard about viruses? Let me put the term properly: "computer virus". Most probably, if you are running a licensed Windows operating system, you will also be running anti-virus software from a well-known vendor that came free with your computer (desktop or laptop). What is the purpose of the AV software? It helps defend you against viruses and worms. Doesn't it protect you from viruses? Yes, AV will help safeguard your computer from viruses, provided you update the AV software quite often. Is that enough protection? Not really.

Think about a bored computer user connected to the Internet through his broadband connection, scratching his head and browsing Orkut for no reason. Suddenly he gets a sweet email, and within seconds he installs a tiny piece of software. As soon as the software is installed, CPU utilization goes up momentarily. What could be the reason?

That computer user is now a victim of a "botnet". What is a botnet, and is it just another fancy term? Mind it: it may seem a fancy term, but it has a devastating character. As soon as bots get installed, they spread the bad news quite fast. A bot is nothing but a tiny piece of software that is installed on a system (Windows, these days) and takes control of it and your network. How does it do that?

It's simple. A bot does one thing and does it well. As soon as the tiny software is installed, an IRC connection is made to a malicious IRC server. This malicious IRC server has a hell of a lot of features. All the exploits are there as tiny modules, and the right modules get downloaded to the box based on its vulnerabilities. The beauty is that the clients keep on downloading malicious modules, and shortly after that the box is no longer yours.

So the next time an email comes to you, be sure that you really want to install the software. Prevention is better than cure. No AV to date can cease the activity of botnets; the only solution is to reimage the system. So feel free to reimage the system even if you are 1% sure that you are infected with a bot. I am not exaggerating, as the effects are devastating. In short, botnets are not viruses, but they are the virus of viruses…

Monday, September 24, 2007

How to Escape from Phishing

Just a few minutes back, I received an email from a popular bank in America about the locking of my banking account, requesting me to re-login to activate it. I was very happy looking at the email because of the humor in it. The simple reason for my laugh is that I do not have an account with any bank in America. Neither have I ever had one.

Being interested in security (these days, I am much inclined towards web application security), I could readily understand that it was phishing. Luckily, it landed in my SPAM folder, which all but confirmed it was phishing. I clicked the link and could see an exact replica of the original site. That was my impression at first sight, but after carefully watching it for two minutes, I could see minute differences between the fake and the original website.

For me it was funny, as I did not have an account, so I could come to a conclusion easily. But think about people who do have an account and received this email. If those users are not security literate, this can possibly lead to monetary losses. What does one need to do when such an email comes in?

  1. First, take the email and do not read it in a hurry.
  2. Spend a few minutes to read it, then re-read, re-read, re-read carefully.
  3. If you are grammatically and syntactically good in English (or whatever the language is), you will find a hell of a lot of mistakes. This alone is enough to suspect phishing, as banks never make such silly mistakes in simple English. You will also find a lot of punctuation errors; these are common mistakes in fake emails and sites.
  4. Check the originating email account. Banks usually send emails from their own domain name.
  5. Follow the link and check the address bar. Verify the website: it should resemble your bank's website, but you will find mistakes (see the sketch after this list).
  6. A phishing email will also have a sense of urgency, for example, "take action in the next 24 hours".
  7. Once you doubt an email, notify the bank (just forward the email you received).
  8. Log in to your bank account by typing the bank URL yourself (if you need to), not by clicking a link in the email. You can also notify your friends, as a social service.
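On point 5, a minimal sketch of the kind of check you can do on a link before trusting it: extract the host and compare it with the bank's real domain. The class name and the example URLs here are made up for illustration.

import java.net.URI;

public class LinkChecker {
    // True only when the link's host is the bank's domain or a subdomain of it
    public static boolean pointsToBank(String link, String bankDomain) throws Exception {
        String host = new URI(link).getHost();
        return host != null && (host.equals(bankDomain) || host.endsWith("." + bankDomain));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pointsToBank("http://www.mybank.com/login", "mybank.com"));          // true
        System.out.println(pointsToBank("http://mybank.com.evil.example/login", "mybank.com")); // false
    }
}

Notice the second example: the fake host merely starts with the bank's name, which is exactly the trick phishers play on a hurried eye.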
The following snapshot is a phishing mail. Check it out for errors.

After Two Days
It was, in fact, phishing. I confirmed it after two days: the site was blocked and the server was down. The following is a snapshot I took two minutes back.

Saturday, September 22, 2007

STEP Auto - Another STEP in my career to cherish

It was my third experience presenting a paper at an international conference and the second STEP Auto. I should say, this time they did a tremendous job and improved a lot compared to the one held early this year. I happened to witness a keynote address and a few best-practice papers. The keynote address was fine and the best-practice papers were equally good. But some of the papers delved deep into test management, which is not so relevant to me at this point of time, so I chose to skip those sessions.

Our talk was scheduled as the last slot in the best-practices track, but does that mean the audience was in the mood to leave for the day? Absolutely not. The audience was wonderful, with a sparkle in their eyes, eager and welcoming thoughts from all the speakers. We were preparing to deliver our speech in a different way, and at the same time we wanted to put forth our thoughts on web application security. It was our firm belief that anything done with passion becomes an art, especially web application security, as there are no silver bullets and attacking web applications is pretty easy. Our thoughts were mainly focused on the words Art, Passion and Character, and the rest of the technical details revolved around these foundations.

We had a surprise even before our talk. The shock was that the last two talks had to be done in 20 minutes instead of the allocated 35 minutes. The person who presented before us literally found it hard to put forth his thoughts, and he was (literally) asking for more time. It is not fair on the part of the conference organizers to take time away from the speakers. So we had some time (20 minutes) to think, and we did not speak with each other. But we had a clear plan, which was not to try to cover all the slides but to cover a few slides in depth.

We stepped in and, as expected, we were requested (asked) to complete the talk in 20 minutes. We assured them that we would stick to the time (we tried to be gentlemen… but really aren't). My friend started off the presentation and progressed through the slides. I didn't see the watch, but he would have taken 12-15 minutes. He talked about web application evolution, threat classification, "panic and patch" and the patch management process. I took over and talked on security in the SDLC, followed by penetration testing. Finally, we wrapped up the talk with a "Take away" and "What it takes to follow".

I wasn't sure about the response from the audience. Here and there we had an unusual (usual) pun. We spoke for 35 minutes, and we got a nice comment from the conference chairman that people would have liked our talk even if we had spoken for 45 minutes. What a comment! I was craving and aiming for comments like those, and they made public speaking a passion. We received a similar response from one of the participants. Overall, it was a great feeling. Technically, I have a long way to go, and this is just a start.

I aspire to write similar blogs in the future quite consistently.

Friday, September 14, 2007

Precious book on Java – Effective Java by Joshua Bloch

Before getting into some useful reviews, I would like to write something about myself four years back. I was an ordinary engineer (even now) learning networking. I said to myself that I would never like Java in my life. But due to various (???) reasons, I was forced to work in Java. My initial days with Java were terrible, and I stumbled like any other newbie. But slowly, I came to like Java. It took me two years to write code with a lot of passion, and I should honestly say that it is due to this great book, "Effective Java". This is absolutely not an over-rating; the book deserves much more credit, as it changed my perspective on Java.

By Java, I mean core Java. This book made me understand the elegance of Java and its strong APIs. While reading, I often referred to the Java libraries written by the author of this book. Each of his words has a meaning to it. The four chapters I like most are Threads, Exceptions, Object Creation and Deletion, and of course Classes and Interfaces. Though the other chapters are equally good, I particularly like these four because they are the cornerstones of Java. The author has given a lot of best practices, and if you apply them, your code becomes much more maintainable, readable and comprehensible.

Threads offer you great flexibility, but writing thread-safe applications is harder. Many current-day applications have serious concurrency bugs, and if they are running properly, it is nothing more than mere coincidence. The book also explains the consequences of over-synchronization and of wait/notify. The chapter on exceptions is the most fulfilling, and it gives two great ideas – exception chaining and exception translation – which are handy when your application has many layers. The book also gives thoughts on object creation, object deletion, classes and interfaces.
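To give a flavor of those two ideas, here is a minimal sketch of exception translation with chaining, roughly in the spirit of the book; the DataAccessException type and the loadUser method are made-up names for illustration.

import java.sql.SQLException;

// Abstraction-level exception for the upper layer
class DataAccessException extends RuntimeException {
    public DataAccessException(String message, Throwable cause) {
        super(message, cause); // exception chaining: the low-level cause travels along
    }
}

class UserRepository {
    public String loadUser(long id) {
        try {
            return queryDatabase(id); // the lower layer speaks SQLException
        } catch (SQLException e) {
            // exception translation: rethrow at this layer's level of abstraction
            throw new DataAccessException("Could not load user " + id, e);
        }
    }

    // Stand-in for real JDBC code
    private String queryDatabase(long id) throws SQLException {
        throw new SQLException("connection refused");
    }
}

The caller deals only with DataAccessException, yet the original SQLException is still there in the stack trace when you need it.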

This is the right book for you to have a copy of if you are really interested in writing code effectively. It helps you to see Java as an art form.

Sunday, September 9, 2007

Favor Composition (“has a”) over Inheritance (“is a”)

!!! Composition and Inheritance should complement each other !!!

The important keywords for any object-oriented programmer are "is a" and "has a". These two keywords relate one object to another. For example, in the real world, "Human" and "Man" have an "is a" relationship, while "Human" and "Hands" have a "has a" relationship. In short, a "Man" is a "Human", and a "Human" has hands. Inheritance is a great tool that helps to define hierarchies and model concepts as real-world objects. It also saves a great amount of code through reuse. In the normal scenario, composition just helps us model objects as they are. For example, the Earth has continents, continents have countries, countries have states, states have cities/towns/villages, and so on. Here composition is used for the plain "has" relationship, and traditionally, when one wants to implement a function, a method is simply added. But composition can also be used in extraordinary ways to bring dynamic behavior into a system. Composition makes software flexible and gives a different dimension to object-oriented programming. Let us quickly get into some action with an example.

Say you need to implement different types of sorting algorithms. There are many sorting algorithms available, and you should implement a bunch of them. Based on the client's requirements, you need to use any one of them (when memory is low, you may go for insertion sort, but if memory is plentiful you can go for quick sort or merge sort). The bottom line is that the client knows which algorithm is needed, and your framework has to do the job: if memory is low, the client decides to go with insertion sort, and the framework needs to use that algorithm. Also, an interested client should be able to fit their own algorithm, "weird sorting", into your framework. How will you go about this problem? How will you design the classes?

There are several ways of solving this problem. The first is the crudest way, where you have all the sorting algorithms implemented in the same class. The single class will have methods – binarySort, insertionSort, heapSort and so on. This straight away blows up a design principle: the open-closed principle. For adding the new "weird sorting", you need to add a new "weird" method. It produces a maintenance nightmare. The second way is slightly smarter: each sorting algorithm is implemented as its own class inheriting from an abstract class "Sorting". But the clients have to pick them based on their requirements, and most importantly, they cannot change the sorting algorithm dynamically.

The third approach to this problem is to define an abstract class or an interface "Sorting" that has a method "sort". Each sorting algorithm implements this "sort" method, and as a result you have many sorting algorithms. When you want to add "weird sorting", it is as simple as adding a new class implementing the "sort" method. Your code follows the open-closed principle, and this avoids a lot of re-testing: you can say for sure that your new code does not introduce a bug into the old code. So far, we have talked about inheritance. This is the usual stuff.

But how will you allow others to invoke the sorting method? You could extend each of the sorting classes so that others can invoke the "sort" method, but that leads to class explosion: when a new sorting algorithm is implemented, you need to change or add code. Instead of doing this, you can hold the sorting algorithm as a component with a "has a" relationship. For example,

interface Sorting {
    void sort();
}

public class Client {
    private Sorting sortingAlgorithm;

    // The algorithm is injected, and can be swapped at any time
    public void setSortingAlgorithm(Sorting sortingAlgorithm) {
        this.sortingAlgorithm = sortingAlgorithm;
    }

    public void someOperation() {
        sortingAlgorithm.sort(); // first sort

        // ... some operation ...

        sortingAlgorithm.sort(); // second sort, possibly a different algorithm by now
    }
}
Consider the method someOperation() of the Client class, and assume that Client is a shared object, so many parties can decide on the particular sorting algorithm. Now the sorting algorithm can be changed dynamically based on various factors. If your application has a memory management module, it can play its part in deciding on the particular sorting algorithm. In the example above, the first call could use one sorting algorithm and the second call a different one. This is what we mean by flexibility.
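A hypothetical usage sketch, assuming InsertionSort and QuickSort are classes that implement the Sorting interface above:

public class SortingDemo {
    public static void main(String[] args) {
        Client client = new Client();
        client.setSortingAlgorithm(new InsertionSort()); // low-memory choice
        client.someOperation();
        client.setSortingAlgorithm(new QuickSort());     // memory freed up: swap at runtime
        client.someOperation();
    }
}

No subclass of Client was needed for either switch; that is the composition at work.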

In order to encourage people to use composition, many developers argue with the words "Favor Composition over Inheritance". These words are simply phrased to give you the power of composition. They should not be taken literally: no composition works well without employing inheritance. So "Composition" and "Inheritance" should complement each other in a true object-oriented perspective. It is time to etch

!!! Composition and Inheritance should complement each other !!!

Open for Extension and Closed for Modification

This is one of the fundamental design thoughts that every designer should deeply analyze. When we say "open", it does not necessarily carry the literal meaning of open, and "closed" does not imply the literal meaning of closed. In the context of design, these two words relate to the level of flexibility given to programmers/developers. This fundamental principle suggests that an application's design should be flexible enough to accommodate new features.

But a sudden spike of thought comes to our mind – for every product release, we add a lot of features. Before proceeding further, let us ask some questions.

  1. Does your product add a few features in each release?
  2. Do you feel that the base architecture remains fairly stable over many releases?
  3. Do the developers know the consequences for their features when the underlying framework changes?
  4. How easy or tough is adding features? (Easy or tough is very relative :) )
  5. How does this fundamental design principle help the framework to be flexible?

There are two scenarios in which developers will modify code. When the software is buggy, they do not have any other way but to modify the code: the developers identify the root cause of the bug and fix it. But some developers open up the code to add new features, and modification of existing code for adding new features conveys that there is something wrong. It is a bad design if a framework forces its clients to modify code when adding new features. Apart from not forcing modifications on clients, the design should also not force itself to undergo a massive code change to support new functionality. When it comes to a framework, the designers should see the framework from another perspective apart from providing basic functionality: the framework is a contract, or a set of guidelines, given to the clients. The bottom line is that the framework is a contract that governs and facilitates the proper functioning of the applications built on it. When it comes to an application, the application should foresee some changes in requirements and burn enough flexibility into the design to accommodate them. The sketch below shows the principle in action.
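Going back to the sorting framework from the previous section, here is a minimal sketch of extension without modification; the WeirdSorting name is, of course, made up.

// A client adds a brand-new algorithm by implementing the existing contract.
// Not a single line of the framework (Sorting, Client) is touched:
// closed for modification, open for extension.
public class WeirdSorting implements Sorting {
    public void sort() {
        // the client-specific "weird" algorithm goes here
    }
}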

The design principle "Open for Extension; Closed for Modification" is a great tool and thought for building software. It is just a thought, though, and achieving it requires a lot of things to be done and known. In a few of my next blogs, I'll be writing on design principles, and these could be a one-liner, a commonly needed design pattern, or concepts taken from the Java library.

Wednesday, September 5, 2007

Improving Test Coverage using Code Coverage

For the past few years, the industry has been undergoing a lot of advancement. In particular, new software development models have been followed: the industry moved from the waterfall model towards Agile and iterative (incremental) models. A lot of progress has also been made in improving productivity by employing tools to aid development. One of the most important activities is unit testing. Unit testing is essential to ensure that the developed code works as expected; without it, development is not complete. Unit testing has to be carried out on the entire code, not leaving out a single line. It is during this phase that the developers make sure the code will work as expected. There are a few questions that we need to ask ourselves during unit testing:
  1. Are we doing it like black-box testing, without looking at the source code?
  2. Do we look at the source code during unit testing?
  3. How do we make sure that the unit tests cover the maximum amount of source code?
Let us get straight into the answers to the above questions. To answer the first question: unit testing is not black-box testing. Doing it like black-box testing is not an effective way of testing software at the unit level. For example, the actual source code could contain lots of conditional, branching and loop statements. Black-box testing does not exercise these language-specific constructs properly; it focuses only on the functionality of the software.

The answer to the second question is that every developer has to look at the source code while writing a unit test plan. This is crucial, and the test cases have to be based on the source code. The developer of the source code is the best person to know it, and hence it is advisable and desirable that the author of the software carries out the unit testing.

Before answering the third question, let us discuss testing effectiveness. Effective testing means covering the entire source code with minimal test cases. When the effectiveness is high, the quality and productivity of the software will also be high. As humans are prone to making mistakes, it is often essential to use tools to improve effectiveness. Code coverage is one such technique for doing unit testing effectively: it pinpoints the areas of the source code that are not tested. Most code coverage tools work from the package level down to the source line level. Once the unit testing session is over, the developer can immediately see the results and improve the test plan to make it more effective.

For example, during iteration #1, the developer will execute the test cases and run code coverage in parallel. During or after the unit testing, the developer can take a look at the coverage reports. Since the code coverage tool reports coverage at the line level, it is easy for the developer to add test cases then and there. Most of the tools can generate comprehensive reports at the desired level (application, package, class, method and line level). A small sketch of coverage-driven testing follows.
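As a minimal sketch of "adding test cases then and there": a coverage tool would flag the else branch below as untested if only testPositive existed, prompting the second test. The class names and the JUnit 4 style are just for illustration.

// MathUtil.java
public class MathUtil {
    public static int abs(int x) {
        if (x >= 0) {
            return x;   // covered by testPositive
        } else {
            return -x;  // reported as uncovered until testNegative is added
        }
    }
}

// MathUtilTest.java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class MathUtilTest {
    @Test
    public void testPositive() {
        assertEquals(5, MathUtil.abs(5));
    }

    @Test
    public void testNegative() { // added after reading the coverage report
        assertEquals(5, MathUtil.abs(-5));
    }
}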

There are a few code coverage tools (open source and commercial) available for Java. Emma and Cobertura are the most popular in the open source arena. Both of them are quite mature and used by many developers across the globe, and there are lots of tutorials available for both. Kindly refer to the respective websites for more information.

Tuesday, September 4, 2007

Software Engineering Practices and Tools – Now or Never

Recently, I was browsing through Google Video and there was an interesting presentation on static analysis. The speaker was a researcher and talked on the importance of static analysis. I have been using a few open source tools like Findbugs, PMD, Checkstyle and Cobertura. I should confess honestly that I have not been using them by heart but merely as a process; they are more of a personal process for me. I should also admit that I did not get a real understanding of the tools' usage. I am not going to blame myself, though, because until recently we did not live in a world of true parallelism. A couple of years back, true parallelism was above a layman's reach; only high-end users and top enterprises used it.

Two kinds of parallelism were possible in the past: superscalar execution and instruction pipelining. I lost track of the advancements in the hardware industry, as the growth was tremendous (and I do not have the competency to track every development). Until recently, we were living in the age of raw speed: a C program that ran in 10 nanoseconds in 1990 might have run in 5 nanoseconds in 2000, with no multiprocessing or multithreading, simply because hardware manufacturers were able to increase clock frequency and execute more instructions per second sequentially. Nowadays, each physical processor has many logical processors, and each logical processor runs in parallel. Sequential execution is fading away, and we are forced to learn multithreading to tap the advantage of multi-cores. As humans, we are not so used to concurrency, and that is our limitation.

In the older days, computers were meant for geeks. But Java, Web 2.0 and web technologies gave computers a wider reach; with this infrastructure, even a layman can now explore the power of the Internet. We cannot imagine a day on this planet without the Internet – mail, blogging, community software and messengers. The world has become a virtual family. A computer engineer will handle software in a different way than a layman, so the software being developed should be easy to use and reliable. How can you achieve reliability? How will you study functionality in detail without tools?

With the wide deployment of software, software security is gaining momentum. It was "ok" to leave vulnerabilities behind in the past, but today, within 15 minutes of your software's release, applications are being attacked. A couple of years back, a honeypot was deployed on the Internet, and within 15 minutes an attacker took it over (the honeypot was protected well enough that the attacker was locked inside). So writing secure software is going to get harder and harder.

In order to address these issues from all sides, the fundamentals have to be strong. Software engineering practices, and following those practices by heart, is the need of the hour. Tools help us uncover most of the low-hanging issues that might otherwise go undetected into the final product. Mature software engineering practices together with tools can improve both quality and productivity: you will release software with fewer bugs in less time. In the future, big companies are going to survive; but it is the companies that follow the practices by heart that are going to become big companies.

Your managers are not responsible if you do not follow processes or use tools. It is time for a paradigm shift. Now or never.

Saturday, September 1, 2007

Profiling Tools for Java Applications

“An apprentice carpenter may want only a hammer and saw, but a master craftsman employs many precision tools”

Tools are primarily used for two reasons – quality and productivity. They help us drill down into problems. When it comes to Java, one has a lot of tools, both open source and commercial. There are lots of development tools available for Java, and you can find consolidated information at http://java-source.net/. This blog discusses profiling Java applications and gives guidelines on when and where to profile.

Believe me, profiling has to be considered a last resort; it is just like debugging a bug in your application. While coding, developers should concentrate on addressing the requirements rather than concentrating on performance. The development team chooses the technology, protocols and algorithms that perform the job effectively. Profiling should be done selectively and on a need basis. Never profile the application to improve its overall performance: if you want to improve the performance of the whole application, start with the design document, and evaluate the algorithms, data structures and infrastructure.

Profiling has to be done only on critical paths. Only 20% of the code is used by the users 80% of the time, and the remaining 80% of the code is used only 20% of the time. It is enough to profile the 20% of code that is used 80% of the time. Once you have decided to profile the application (even for fun), you need a tool that gives you reliable data. Though you can always settle for printing the time at each method's entry and exit, most of the time you will not have access to the source code, or you will not be able to add print statements, for various reasons (see the crude sketch below for when you can). This is where profiling tools become handy, and once again, for Java you have many open source profiling tools. The Netbeans profiler is a great tool to profile an application for speed and memory consumption. It can profile an entire application or part of it; you can even profile a single statement and get to know its performance cost. You can find the Netbeans IDE and Profiler at http://www.netbeans.org
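For completeness, the crude print-the-time fallback mentioned above looks something like this, assuming you do control the source; processOrders is a hypothetical stand-in for your critical path.

public class CrudeTimer {
    public static void main(String[] args) {
        long start = System.nanoTime();
        processOrders(); // the critical path being measured
        long elapsed = System.nanoTime() - start;
        System.out.println("processOrders took " + (elapsed / 1000000) + " ms");
    }

    // Stand-in for real work
    private static void processOrders() {
        for (int i = 0; i < 1000000; i++) {
            Math.sqrt(i);
        }
    }
}

A real profiler gives you this per method, and per statement, without touching the code.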

I have been using Netbeans and the Netbeans profiler for the past one year, and I am quite satisfied with the features and results. It integrates nicely with any application, and that is the crux. Before using Netbeans, I tried a few other open source profilers, but I spent a lot of time just finding out how to use the tool. With Netbeans, I bet you will take off within 10 minutes. After installing the Netbeans profiler, it is enough to spend a few minutes going through the "Profiler" menus, and you can happily explore its functionality in your free time.

CoW – Linux way of creating processes

Linux is one of THE most popular operating systems and continues to lead the embedded operating systems market. In this blog, I would like to give an overview of a design decision in Linux; it is one of the reasons why I see Linux not only as software but also as an art. The kernel developers' thought leadership is unquestionable. I am a great fan of the Linux kernel developers :). Without bogging you down, let me come to the point directly.

UNIX operating systems create processes using the system call fork() and overlay (load) a binary/executable image using the exec() system call. The system call fork() does not take a parameter, while exec() takes a few parameters to load a program from permanent storage (the file system). Practically, fork() and exec() are twins; fork() is the elder brother and exec() the younger one. Let us first see the functionality of "fork". As you may know, a process is a program in execution that has state such as data, heap, stack, pending signals, open files and environment variables. When you want to execute something, say the command "ls" from the shell, the "shell" process typically calls fork(). When fork() is successful, the kernel creates another process which is a copy of the calling process. So each successful call to fork() returns twice – once in the calling process (aka the parent process) and a second time in the newly created process (aka the child process). Traditionally, operating systems copy the address space of the parent process to create the new address space, so there is overhead involved in fork'ing a process.

Shortly after the process is created, either the parent or the child process is usually loaded with some other program, typically via one of the system calls in the "exec" family. When "exec" is executed, the entire address space of the calling process is recreated. So a considerable amount of time is spent in this "double creation", and surely some kind of optimization can be done to gain substantial performance.

In Linux, fork() does not copy the address space; it simply creates the kernel data structures needed for the new process. Both the parent and child process now share the same address space, and its pages are marked read-only, so both processes can continue to read it freely. Only when one of them tries to write to a (memory) page does the kernel duplicate that page and give the writer its own copy; often this never happens. The other common scenario is loading a new program: when "exec" is called, the kernel creates a fresh address space anyway and starts executing the loaded program. With this approach, the "double creation" is avoided. Deferring the duplication of the address space has bought performance; sometimes procrastination helps :).

This behavior of fork() is called CoW – Copy on Write. So from next time, when you see a Linux box and a running process, think about CoW.
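To connect this back to Java: when a JVM on Linux launches a child process, it typically goes through this same fork()/exec() pair underneath, and CoW is what keeps forking the (large) JVM process cheap. A minimal sketch:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class LaunchLs {
    public static void main(String[] args) throws Exception {
        // Under the hood on Linux: fork() the JVM (cheap, thanks to CoW), then exec() ls
        Process p = new ProcessBuilder("ls", "-l").start();
        BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
        String line;
        while ((line = out.readLine()) != null) {
            System.out.println(line);
        }
        p.waitFor();
    }
}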

But you have two processes after fork(); which process will be scheduled first? Yes, Linux is a masterpiece.

Spirit of Open Source

Having worked in a software company for the past five years, I am now able to appreciate the spirit of open source. When I say open source, I don't necessarily mean free software; open source is much more than free software. "Free" does not carry any meaning in terms of monetary benefit (though open source provides profitability). Free refers to the freedom of using the software, modifying it, and of course helping others by redistributing it. Though these freedoms are enforced through open source licenses, there is one thing that is not enforced but is followed by heart by the community. By community, I mean all the developers and users of any open source software.

If you are a computer geek or a software engineer, you have probably used at least one piece of open source software. More and more vendors are moving towards open source to capture their market or to make their products better. When they come in, they advertise that they are for open source. But once they become stable, they try to stand on top of the spirit of open source; they see open source only legally. In the short run they may gain popularity, but that is a mirage, and they are quite satisfied with the mirage as it bears more fruit than they expected.

But there are many people who are totally vendor-unbiased and develop open source software with noble thoughts. These people understand, respect and nurture open source. They propagate and advocate it, and they release version after version of their software under open source licenses. The most popular open source operating system, Linux, is one of the best examples.

Security - Now the Programmer's Panorama

The days when access lists, VPNs, IDS/IPS and firewalls were enough are gone. Don't get me wrong: those are still great technologies to protect your assets, but the world is now moving into another cycle; it is going back to the place where it started. Thanks to Web 2.0 adoption, people collaborate using the Internet for many things – just like this blog. Web application deployment is marching much faster than expected, and we are almost on the verge of IP address depletion. Without the Internet, the world might stop for a while (and every software engineer would need to relearn problem solving and take an elective on how to work without search engines; some would end up doing a PhD on this).

Web applications, the little doors to mighty businesses, are now gaining attention from attackers. It is not only due to the value of the assets or the amount of profit: it is simply very, very easy to attack a web application. I have recently gone through a couple of books on web application security. Though I did not go through them in detail, the methods and tools are simple to use, and you do not need to be a geek to do all the fancy stuff.

Oh God. Some of the web application security forums say "90% of web sites have vulnerabilities", and it is true to a major extent. For the past two weeks, I have been trying to find a website that does one thing, yes, just one thing, well. I am talking about input validation. If you need a single yardstick to judge whether your blog, Orkut or bank account is safe, just try to find out whether input validation is done properly. Missing input validation is a worse culprit than XSS, SQL injection and broken authentication, because it is what lets most of them in.
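A minimal sketch of the whitelist style of input validation being argued for here; the pattern is an assumption for illustration, and a real application would tighten it to whatever each field actually allows.

import java.util.regex.Pattern;

public class InputValidator {
    // Whitelist: accept only letters, digits, spaces and a few safe punctuation marks
    private static final Pattern SAFE_TEXT = Pattern.compile("^[A-Za-z0-9 .,_-]{1,64}$");

    public static boolean isValid(String input) {
        return input != null && SAFE_TEXT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("John Smith"));                 // true
        System.out.println(isValid("<script>alert(1)</script>"));  // false: rejected outright
    }
}

The point of whitelisting over blacklisting is that you never have to enumerate every possible attack string; anything outside the known-good alphabet is simply refused.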

There are a few great books on software security, and I particularly enjoyed reading "Web Application Hacking Exposed". You may want to check the Amazon reviews before buying one and investing time. After reading the book, you will find that the best way to defend against attackers is to write solid code, follow software engineering best practices, do code reviews, run static analysis and do pen tests. Sometimes you will also feel like hacking your own application to keep attackers under your toe.

Yes, it is your feeling, action and passion that make better software, not the tools. Tools just help you reach your destination faster.

Wednesday, August 29, 2007

OS Fingerprinting - Most Fulfilling Talk

Today, Rajkumar and I gave a talk on operating system fingerprinting. Rajkumar, the main speaker, started with the security mindset and narrated various reconnaissance attacks. He then explained OS fingerprinting, particularly active fingerprinting. During his speech he covered TCP/IP implementation differences among popular (and infamous) operating systems like Windows XP, Linux and Solaris, and demonstrated active OS fingerprinting with NMAP.

In the second part of the session, I started with an overview of buffer overflows to emphasize why OS fingerprinting is essential. I then explained passive OS fingerprinting (POSFP) with a real incident that took place in the world. We also discussed positive aspects of OS fingerprinting, like network auditing.

That was one of our most fulfilling talks on network security. The audience was great, and overall it was a wonderful feeling.

Tuesday, August 28, 2007

Security - Art, Passion and Character

Having worked in network security for three years, I got addicted to attacks and VD. Though not a veteran, I would proudly say that I am a security enthusiast. To study or master security, one needs to know the internals of how things work. Security is not a technology or a silver bullet. In a broader perspective, security is a way of life, a place that can never quite be reached in this Information/Internet era.

Whenever I get a chance to connect with the security aspects of life, I tune myself in. Rajkumar (a colleague of mine) and I were browsing through some websites, and most of them had logic bombs in them. As far as a web application is concerned, it needs to take one thing seriously. Linux geeks used to say "Do one thing. Do it well," and it greatly applies to the web; for a web application, the quote should be "Do validation properly. Do it well." Coming back to those web applications, they did something fundamentally wrong. We never tried XSS, SQL injection and stuff like that; we simply played with the input fields. Havoc…

Based on that, we came out with a paper on web application security. Personally, we don't prefer to preach to others; this paper is just a set of guidelines for making better software. The time has come to build security into the product. Security is no longer a plug-in.

We will be presenting our paper at "STEP Auto", an international conference on software testing, process and automation. More at the conference.

Monday, July 2, 2007

Experiences With Netbeans

I have been using the Netbeans IDE (5.0 and 5.5) for the past one year, and my experiences have been great. Prior to Netbeans, I was using Eclipse on my high-end desktop. Yes, I had 2 GB of RAM and a 2.x GHz processor, yet when I started my applications and Eclipse together, I faced a lot of difficulties running both of them. I do not find fault with Eclipse; it is the sheer nature of my application that caused the problem.

For experiment's sake, I installed Netbeans. The download happened quite quickly, and within 30 minutes I was able to install Netbeans and set up the project. Since I only needed to set up a debugging session, my task was simple: just compile the sources and start Tomcat in debug mode. I was happy to connect to the remote Tomcat (on the same host) and complete the debugging. Since I was used to debugging applications in Eclipse, debugging with Netbeans came easily. I would give full marks to Netbeans for being a small but efficient IDE.

Over the next few days, I used Netbeans almost daily, and that made me investigate the tool more. Suddenly, I needed to work on a project which was a web application running on Tomcat and powered by Struts. Netbeans, the crusader, came to my rescue: it has a built-in Tomcat and features that support Struts. Soon I had all the configuration files ready, and without writing any source, I built and ran the project. My test web application was launched in my default browser, Firefox. Once again, I am pushed to give full marks for the integrated Tomcat, and two more stars for the feature to configure it. It was a "WoW" feeling.

I don't find any differences between Netbeans and Eclipse in other features like refactoring, help, views/perspectives, etc. Both are equally good, and I don't have any points to favor one over the other in the basic aspects of an IDE. I believe the Netbeans developers care very much about people's needs rather than feature lists, and this is reflected in the integrated servers (like Tomcat and the web server); in the case of Eclipse, we need to download additional megabytes to install plug-ins, and the plug-ins are version dependent.

As of now, the main disadvantage of Netbeans is that there are only a few plug-ins, while for Eclipse hundreds of plug-ins are already available. The Netbeans community should improve a lot, and fast, to make life easier for developers; unless this is addressed, developers will be in a dilemma over whether to migrate to Netbeans completely. At the least, software development tools like code coverage and metrics calculation should be the first priority.

Hope the Netbeans Community will address this soon.

Have a Great Day

Thursday, June 21, 2007

Garbage Collection - Moving Closer

In the previous blog, there was an open-ended question. The question goes like this: "In your JVM, you have a few objects that are eligible for garbage collection. The JVM also runs the garbage collector. Now the question is, will all the objects that are eligible for GC actually be garbage collected?" The answer to this question needs a deeper understanding of the various garbage collection algorithms, the size of the objects and, more importantly, the life duration of the objects. Believe me, the JVM's GC is not one algorithm but a pool of algorithms that sweep the heap.

We had an overview of garbage collection from a programmer's perspective in the previous write-up. It was just an introduction, and we did not discuss much from the JVM's perspective. Some of you would have started asking, "Why should I know all this? I am just a developer." Thanks for your thoughtful question, and I appreciate it. You might have a high-end desktop for coding a "Hello, World" program, and your application might work as expected. But an enterprise-level Java application faces a lot of challenges in terms of functionality and performance. If you know the internal workings of a system, you get a broader perspective, which possibly leads to more judicious usage of the system. Your knowledge of the JVM will come in handy, and it will certainly pay off in the near future.

As we discussed in the previous write-ups, there are two pieces of work that are key to successful garbage collection. The first is to identify the garbage objects (objects that are no longer used), and the second is to actually clean them up. Apart from cleaning up, the garbage collection algorithm also does memory organization by relocating objects in the heap.

Garbage Collection - A Closer Look

In the case of the Sun Hotspot JVM, garbage collection takes place using a generational collector. Basically, there are two postulates on which the generational collector works:
  • Most of the objects (more than 90%) are short lived. They die soon after they are created.
  • A smaller percentage of the objects are long lived. They live almost as long as the JVM itself.
Based on these two postulates, the Sun Hotspot JVM's garbage collection is designed as follows. The entire heap is sub-divided into two regions – the young generation and the old generation. The young generation is smaller in size compared to the old generation. Objects are initially created in the young generation, and they are promoted to the old generation if they live long enough. The JVM employs two different garbage collection algorithms in the young and old generations, as the nature of the objects varies. The following diagram gives a pictorial representation of the JVM heap.

Figure 1 – JVM heap divided into the young and old generations
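To watch these two generations at work (a sketch assuming Sun JDK 1.5; MyApp is a placeholder for your main class), you can size the young generation and print per-generation statistics from the command line:

java -Xmn64m -XX:+PrintGCDetails MyApp

Here -Xmn fixes the young generation size, and -XX:+PrintGCDetails makes the JVM report what each collection did in the young and old generations.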

Wednesday, June 20, 2007

Garbage Collection – Part 1

Java is platform independent, and the Java Virtual Machine, the platform-dependent component of Java, provides this platform independence. The JVM is an exceptional user-space process: it runs everywhere from mobile phones to high-end servers. The JVM has to provide platform independence in class loading, threading, input/output and garbage collection. From the operating system's perspective, the Java Virtual Machine is just another process. But things are not as easy as they seem: starting from the loading of classes to the recycling of objects, the JVM behaves like an operating system by itself. In this series of blogs, we will discuss how objects are recycled, the various garbage collection algorithms, and how to tune your JVM for better performance. Our discussions will be based on Sun JDK 1.5 (Tiger), and we will use open source tools, as they are freely available and everyone can use them.

Overview of GC

The JVM is just a mighty big process (or it could be a tiny process running on a mobile phone) – that's the operating system's perspective. The JVM internally does a lot of magic, but nothing is visible to the operating system; the operating system just services the JVM. Garbage collection and threads are integral parts of the JVM. At least a few threads run whenever the JVM is alive, and the GC thread is one among them. But GC is a low-priority thread, and the JVM runs it only on a need basis, typically when there is memory scarcity. When it runs, though, it stalls the other threads in the JVM. The user program creates objects and uses them as long as it wants, then de-references them when they are no longer needed; the JVM takes care of deleting the unused objects, executing any finalizers along the way.

A question comes to mind immediately: who decides eligibility for garbage collection? Or, how are objects picked for deletion? Though the JVM automatically deletes unwanted objects, it is the developer's responsibility to indicate that he/she does not want an object anymore. The developer need not say this explicitly; the JVM understands it implicitly. Among the objects in the JVM, a few are special and are named "root objects". Usually the local references of all the stack frames (local variables of all the methods that haven't exited), string objects in the constant pools of loaded classes, and the class variables (static variables) are termed root objects.

When the JVM wants to do garbage collection, it removes the unused objects from the heap. All the objects that are reachable from any root object directly, or that are chained to reachable objects, are the objects currently in use. All other objects, not reachable from the root objects either directly or through objects linked to them, are termed "unused objects". During garbage collection, the JVM marks the unused objects and deletes them, freeing up memory. But the algorithm for finding and deleting unused objects varies greatly from implementation to implementation; even a single JVM implementation may have many GC algorithms, used according to the application and the situation. Before deleting an object, the JVM checks whether it has a finalizer. If it has one, the JVM postpones freeing the object until the finalizer has run; for such objects, GC has one additional step, invoking the finalizer, and until then the JVM does not delete the object.

So far we discussed theoretical aspect of the garbage collection process, the rest of this section explains with an example.

Looking at the figure, the objects in yellow are the root objects. All the objects chained to root objects are currently being used (represented in pink). The objects that are not reachable from any root object, directly or indirectly, are unused objects, represented as dotted circles. Unused objects may well be linked to each other, but they are still unused and eligible for garbage collection. The point is that live objects must be reachable from one of the existing stack frames (for each method invocation, the JVM pushes a stack frame containing local variables and an operand stack).


Eligibility for GC

In the last section, we discussed GC from 10,000 feet. In this section, we are going to look at a Java program and identify the objects that are eligible for garbage collection at various stages of the program. By "eligible for GC", we should understand that the objects are only eligible: we are telling the JVM that we no longer need them. It is up to the JVM to delete those objects and recycle the memory, and the JVM will make every possible attempt to do so.
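The original listing for this example was an image and is not reproduced here; the sketch below is a reconstruction consistent with the line numbers referenced in the discussion (the referenced lines are marked in comments, since the snippet itself is shorter than the original file).

public class GCDemo {
    public static void main(String[] args) {
        String string = new String("main");        // line 18
        Integer i = new Integer(1);                // line 19
        Integer j = new Integer(2);                // line 20
        display(string, i, j);                     // line 21
        i = null;                                  // line 23
        j = null;                                  // line 25
        string = null;                             // line 27
    }

    static void display(String str, Integer i, Integer j) {
        String string = new String("display");     // line 32
        i = null;                                  // line 34
        j = null;                                  // line 37
        string = null;                             // line 40
        str = null;                                // line 43
    }
}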


Consider the code above. At lines 18, 19 and 20 we are creating objects. Assume that control is at line 21, after executing line 20. At this point of time, we have references to four objects, referred to by "args", "string", "i" and "j", so there are four root objects. The root objects and the objects reachable from them are not eligible for GC; hence at line 20, no objects are eligible for GC.


When the JVM executes the method "display", control goes to that method, where it has references to the objects passed as arguments. Apart from the arguments, "display" has one more local variable, "string", which also becomes a root object; after executing line 32, the total number of root objects becomes 5. At lines 34 and 37, even though "i" and "j" are de-referenced, the objects they pointed to cannot be garbage collected, as the method "main" still holds references to them. At line 40, the object referred to by "string" becomes eligible for garbage collection. At line 43, the object referred to by "str" is again not eligible, as the method "main" has a reference to it.

When control gets back to the method "main", after executing line 23, the object referred to by the variable "i" becomes eligible for GC. Subsequently, the objects referred to by "j" and "string" become eligible for GC when lines 25 and 27 are executed, respectively.

To summarize
  • Root objects are decided not only based on the local variables of the method currently being executed: the JVM goes through the entire stack to find root objects.
  • Apart from the stack, the JVM also looks into "static" (class) variables and keeps them as roots. So objects held in static variables become eligible for garbage collection only when the class is unloaded from the JVM.
  • A few JVMs implement the method area inside the Java heap, that is, the JVM allocates memory in the heap to hold code. Those objects also become eligible for GC when the classes are unloaded.

Questions for Understanding

1. What is garbage collection?
2. How are objects recycled?
3. What is the standard garbage collection algorithm recommended by the Java Virtual Machine Specification?
4. Elaborate on the object lifecycle.
5. What are the candidates for root objects? How do they affect garbage collection?

Question for Thinking

In your JVM, you have a few objects that are eligible for garbage collection. The JVM also runs the garbage collector. Now the question is: will all the objects that are eligible for GC actually be garbage collected?

Answer in a single word: "yes" or "no".

If you are not sure what to answer, the next blog will open up some of the concepts, and eventually you will be able to answer the question.

Keep Watching and Have a Great Day

Garbage Collection - Understanding the Death

If you are a C/C++ guy who migrated to Java, Java might have impressed you with garbage collection. Yes, it is true: Java cleans up your mess automatically, but it is your responsibility to show it which things are mess. Even with Java's automatic GC, memory leaks are still possible in Java; refer to "Effective Java" by Joshua Bloch (a great book on using Java effectively), and see the sketch below. Like Linux, Java is used everywhere from mobile phones to high-end servers. Would you believe me if I said that Java does not have just one garbage collection algorithm but a suite of algorithms? On top of that, the interesting fact is that the garbage collection algorithms are tunable and configurable.
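A minimal sketch of the kind of leak the book describes: a stack that keeps references to popped elements, so they stay reachable from a root and are never eligible for GC even though the program has logically discarded them. The class is made up for illustration.

public class LeakyStack {
    private Object[] elements = new Object[16];
    private int size = 0;

    public void push(Object e) {
        if (size == elements.length) {
            Object[] bigger = new Object[2 * elements.length];
            System.arraycopy(elements, 0, bigger, 0, size);
            elements = bigger;
        }
        elements[size++] = e;
    }

    public Object pop() {
        Object result = elements[--size];
        // LEAK: elements[size] still references the popped object,
        // so the GC can never reclaim it. The fix is one line:
        // elements[size] = null;
        return result;
    }
}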

By default, GC works in a specific way, but it can be changed based on the application's requirements when starting up the Java Virtual Machine. We will be discussing garbage collection techniques and GC tuning techniques with simple examples. Throughout the exercise, we will rely on GC statistics generated by the JVM to internalize garbage collection.

To understand the next few blogs (I don't know how many blogs I will write on GC), it is assumed that you know Java. Even if you don't know Java, the algorithms will be of interest if you are planning to write a memory management module. I bet you will like the way it is written.

Monday, June 18, 2007

java -Xms

Kick Starting With a Little Knowledge (java -Xms)

"Java is an interpreted language and lacks performance, as it runs on the Java Virtual Machine." This is an age-old statement, and it was probably true during Java's early stages. It is quite natural for any software or hardware to scale up and perform better over time, and Java improved exceptionally fast. Today, with a lot of improvements and flexibility, Java technologies give you an edge, and the Java Virtual Machine plays a crucial role in these performance improvements. A newbie can get started with Java and write a decent application with great ease. In this blog, I would like to share information about the JVM, the JDK, APIs, tools and ideas related to Java technology. As of now, I am planning to write regularly, though of course I do not know at what interval. Anyway, I am positive about blogging and sharing some of the information I find interesting. Keep watching and give back your comments.
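Since the title names the flag, here is a small taste of it (MyApp is a placeholder for your main class): -Xms sets the initial heap size the JVM starts with, and its companion -Xmx caps how large the heap may grow.

java -Xms256m -Xmx512m MyApp

For applications with a known footprint, setting -Xms close to -Xmx can avoid repeated heap resizing as the application warms up.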