
For gender, the system checks the profile for about 150 common male and 150 common female first names, as well as for gender-related words such as father, mother, wife and husband.
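The profile check described above can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the name lists are hypothetical and heavily shortened, and the tie-breaking rule (no label when the evidence is absent or mixed) is an assumption.

```python
import re

# Hypothetical, heavily shortened lists. The system described in the text
# uses about 150 names per gender; those lists are not reproduced here.
MALE_NAMES = {"jan", "peter", "kees"}
FEMALE_NAMES = {"anna", "maria", "sanne"}
MALE_WORDS = {"father", "husband"}
FEMALE_WORDS = {"mother", "wife"}


def guess_gender(profile_text):
    """Return 'male', 'female', or None when evidence is absent or mixed."""
    # Lowercase the profile and keep each distinct alphabetic token once.
    tokens = set(re.findall(r"[a-z]+", profile_text.lower()))
    male_hits = len(tokens & (MALE_NAMES | MALE_WORDS))
    female_hits = len(tokens & (FEMALE_NAMES | FEMALE_WORDS))
    if male_hits > female_hits:
        return "male"
    if female_hits > male_hits:
        return "female"
    return None  # assumption: no label rather than a guess on a tie
```

Returning no label when nothing matches is consistent with a system that assigns a gender to only part of the users.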

The general quality of the assignment is unknown, but in the (for this purpose) rather unrepresentative sample of users we considered for our own gender assignment corpus (see below), we find that about 44% of the users are assigned a gender, which is correct in about 87% of the cases.
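To see what these two figures imply together, the arithmetic can be written out as below. The variable names are illustrative. The result is that roughly 38% of all users in that sample receive a correct label, while 56% receive none.

```python
coverage = 0.44   # share of users that receive any gender label
precision = 0.87  # share of the assigned labels that are correct

correct_share = coverage * precision      # users labelled correctly
wrong_share = coverage * (1 - precision)  # users labelled incorrectly
unlabelled_share = 1 - coverage           # users left without a label

print(f"correct: {correct_share:.1%}, wrong: {wrong_share:.1%}, "
      f"unlabelled: {unlabelled_share:.1%}")
```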

Another system that predicts the gender for Dutch Twitter users is Tweet Genie, to which one can provide a Twitter user name, after which the gender and age are estimated based on the user's last 200 tweets.

For each blogger, metadata is present, including the blogger's self-provided gender, age, industry and astrological sign. The creators themselves used it for various classification tasks, including gender recognition (Koppel et al. The men, on the other hand, seem to be more interested in computers, leading to important content words like software and game, and correspondingly more determiners and prepositions.

One gets the impression that gender recognition is more sociological than linguistic, showing what women and men were blogging about back in A later study (Goswami et al.

Gender recognition has also already been applied to tweets. (2010) examined various traits of authors from India tweeting in English, combining character N-grams and sociolinguistic features like manner of laughing, honorifics, and smiley use.

With lexical N-grams, they reached an accuracy of 67.7%, which the combination with the sociolinguistic features increased to 72.33%. (2011) attempted to recognize gender in tweets from a whole set of languages, using word and character N-grams as features for machine learning with Support Vector Machines (SVM), Naive Bayes and Balanced Winnow2.
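As an illustration of the kind of features involved, word and character N-grams can be counted as in the sketch below. The function names and the whitespace tokenization are assumptions made for this example. The studies cited do not specify their preprocessing here.

```python
from collections import Counter


def word_ngrams(text, n):
    """Count sequences of n consecutive whitespace-separated tokens."""
    tokens = text.lower().split()
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def char_ngrams(text, n):
    """Count sequences of n consecutive characters, spaces included."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))
```

The resulting counts would then be turned into feature vectors for a learner such as an SVM or Naive Bayes. Character N-grams also pick up spelling habits such as "haha" versus "hihi", which word N-grams only see as whole tokens.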

2009) managed to increase the gender recognition quality to 89.2%, using sentence length, 35 non-dictionary words, and 52 slang words.

The authors do not report the set of slang words, but the non-dictionary words appear to be more related to style than to content, showing that purely linguistic behaviour can contribute information for gender recognition as well.
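A feature extractor in the spirit of the description above might look like the following sketch. The word lists are hypothetical placeholders, since the actual 35 non-dictionary words and 52 slang words are not given here. The sentence splitting and the use of aggregate counts are also assumptions.

```python
import re

# Hypothetical placeholder lists; the real lists are not available here.
NON_DICTIONARY_WORDS = {"gr8", "2day", "thx"}
SLANG_WORDS = {"lol", "omg"}


def style_features(text):
    """Return average sentence length plus counts of listed words."""
    # Split on runs of sentence-final punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    avg_len = len(tokens) / len(sentences) if sentences else 0.0
    return {
        "avg_sentence_length": avg_len,
        "non_dictionary_count": sum(t in NON_DICTIONARY_WORDS for t in tokens),
        "slang_count": sum(t in SLANG_WORDS for t in tokens),
    }
```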

For our experiment, we selected 600 authors for whom we were able to determine with a high degree of certainty a) that they were human individuals and b) what gender they were.

We then experimented with several author profiling techniques, namely Support Vector Regression (as provided by LIBSVM; Chang and Lin 2011), Linguistic Profiling (LP; van Halteren 2004), and TiMBL (Daelemans et al.
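TiMBL is a memory-based learner, a family of methods built around k-nearest-neighbour classification. The sketch below shows that general idea in plain Python. It does not use TiMBL's actual interface, feature weighting, or settings, and the function names and feature tuples are hypothetical.

```python
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions on which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_classify(train, instance, k=3):
    """Return the majority label among the k nearest stored examples.

    train is a list of (features, label) pairs. Ties in distance are
    broken by storage order, because Python's sort is stable.
    """
    neighbours = sorted(
        train, key=lambda example: overlap_distance(example[0], instance)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

For example, with stored examples such as `(("haha", "smiley"), "F")`, a new author is assigned the label that is most common among the stored authors with the most similar feature values.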

Then we describe our experimental data and the evaluation method (Section 3), after which we proceed to describe the various author profiling strategies that we investigated (Section 4).

Gender Recognition

Gender recognition is a subtask in the general field of authorship recognition and profiling, which has reached maturity in the last decades (for an overview, see e.g. Even so, there are circumstances where outright recognition is not an option, but where one must be content with profiling, i.e.
