Internet traffic classification is vital to network operation and management. Traditional classification methods such as port mapping and payload analysis are becoming increasingly difficult to apply, as newly emerged applications (e.g. Peer-to-Peer) use dynamic port numbers, masquerading techniques and encryption to avoid detection. This paper presents a machine learning (ML) based traffic classification scheme, which offers solutions to a variety of network activities and provides a platform for evaluating classifier performance. The impact of dataset size, feature selection, number of application types and ML algorithm selection on classification performance is analyzed, and the experiments demonstrate the following: (1) Genetic algorithm based feature selection can dramatically reduce the cost without diminishing classification accuracy. (2) The chosen ML algorithms can achieve high classification accuracy. In particular, REPTree and C4.5 outperform the other ML algorithms when computational complexity and accuracy are both taken into account. (3) Larger datasets and fewer application types result in better classification accuracy. Finally, early detection using only the first several packets of a flow is proposed for real-time network activity, and preliminary results indicate that it is feasible.
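
To make the early-detection idea concrete, the following minimal Python sketch summarizes a flow from its first few packets only, producing statistics that could be fed to a classifier. The function name `early_flow_features`, the chosen statistics, and the default `n=5` are assumptions made for illustration; they are not the feature set or packet count used in this paper.

```python
from statistics import mean, pstdev


def early_flow_features(packets, n=5):
    """Summarize the first n packets of a flow (hypothetical feature set).

    packets: list of (timestamp_seconds, size_bytes) tuples in arrival order.
    """
    head = packets[:n]  # ignore everything after the first n packets
    if not head:
        raise ValueError("flow has no packets")
    sizes = [size for _, size in head]
    times = [ts for ts, _ in head]
    # Inter-arrival gaps between consecutive packets in the prefix.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "pkt_count": len(head),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "min_size": min(sizes),
        "max_size": max(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
    }


# Example flow: six packets, of which only the first four are inspected.
flow = [(0.0, 60), (0.1, 1500), (0.3, 1500), (0.6, 40), (1.0, 40), (2.0, 1000)]
features = early_flow_features(flow, n=4)
```

Because only a fixed-length prefix of each flow is inspected, the per-flow cost is bounded regardless of how long the flow lasts, which is what makes this kind of approach plausible for real-time use.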