Thursday, December 8, 2011

ARO - Doc Comment Parsing in PHP

Following the work on IoC (Inversion of Control), here I am again to present the second module of my project: ARO (At the Rate Of). ARO is a PHP doc-comment parsing library, useful for parsing annotations, descriptions, and the like out of classes, methods, and properties.

ARO uses PHP Reflection to get its job done. This library too turned out flexible enough that I thought I could release it as open source. It's hosted on GitHub (https://github.com/ashwanthkumar/aro-php). Feel free to fork it or report an issue.

I tried my best to document it on GitHub, so let me skip further details here.
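To give a flavour of the underlying mechanism, though, here is a minimal sketch (not ARO's actual API) of reading a doc comment via Reflection and pulling the annotations out with a regex:

```php
<?php
// Minimal sketch of the mechanism (not ARO's actual API): read a doc
// comment via Reflection and extract the @annotations with a regex.

/**
 * A sample service class.
 *
 * @Inject(name="mailer")
 * @Singleton
 */
class SampleService {}

$reflector  = new ReflectionClass('SampleService');
$docComment = $reflector->getDocComment();

// Match "@Name" or "@Name(params)" occurrences inside the comment.
preg_match_all('/@(\w+)(?:\(([^)]*)\))?/', $docComment, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    $name   = $m[1];                      // e.g. "Inject"
    $params = isset($m[2]) ? $m[2] : '';  // e.g. 'name="mailer"'
    echo "Annotation: $name ($params)\n";
}
```

The same trick works for methods and properties via ReflectionMethod and ReflectionProperty, which also expose getDocComment().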

Disclaimer: The reason I created ARO in the first place is to use IoC effectively. There are also many more advanced doc-comment parsers for PHP; this module is just what I came up with in around 5 hours.

Wednesday, December 7, 2011

IoC-PHP - My (little) Inversion of Control on PHP

After around 5 hours, I am happy to say I have completed the first module of my BlueIgnis project: the IoC (Inversion of Control) module. Now I can dynamically inject dependencies into my application as and when needed.

This is definitely not the first IoC implementation in PHP, but I just wanted to create one for my needs. After all, it's my project, and I would like to code it from scratch (I can't believe I am saying this).

A note of caution: I am a newbie to the entire IoC thingy; I know very little, or may have misunderstood the entire concept. So I wanted to learn from the community, and hence decided to release it as open source so that people can have a look at it or maybe even improve it further.

I currently don't bother much with licensing issues. If you like the code, or you think it can be improved further, please let me know.

For people who do not know what IoC is, have a look here.
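In a nutshell, the core idea looks something like the minimal sketch below - bind a name to a factory, and resolve it on demand. This is only an illustration of the pattern, not necessarily how ioc-php does it:

```php
<?php
// A minimal IoC container sketch (only an illustration of the pattern,
// not necessarily how ioc-php does it).

class Container {
    private $bindings = array();

    // Register a factory closure under a name (usually an interface).
    public function bind($name, $factory) {
        $this->bindings[$name] = $factory;
    }

    // Resolve a name into an instance; the container is passed along so
    // factories can resolve their own dependencies in turn.
    public function resolve($name) {
        if (!isset($this->bindings[$name])) {
            throw new Exception("No binding registered for $name");
        }
        $factory = $this->bindings[$name];
        return $factory($this);
    }
}

interface Logger { public function log($msg); }

class FileLogger implements Logger {
    public function log($msg) { echo "[file] $msg\n"; }
}

$c = new Container();
$c->bind('Logger', function ($c) { return new FileLogger(); });

// The caller never constructs FileLogger itself - the container decides.
$logger = $c->resolve('Logger');
$logger->log('Hello, IoC!');
```

The point is that swapping FileLogger for some other Logger later means changing one binding, not every caller.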

The next step would be to achieve the design goals of IoC using annotation support.

Link for the project - https://github.com/ashwanthkumar/ioc-php


BlueIgnis - Starting Finally

Finally, after almost 8 months of planning, modelling, and designing, BlueIgnis is taking shape. The good part is that my mentor has asked me to do Oracle ThinkQuest as the final-year project. I was hunting for a good topic to build - the category I am eligible for is "Application Development" - and while wondering what to pick, I decided it's time I spent some time on BlueIgnis (aka Social Heat).

Drawing on the wonderful experience of building a custom framework using simple design patterns for Webnaplo (more on this in a later post), and on the previous TGMC project (Back To My Village) - Reformists, the past 2 days went well building the BlueIgnis architecture, with new features I had only dreamt of before.

I am all excited to work on this, this time with more features packed into its design. Since I am planning to release it as a standalone app and not as SaaS (I wish I could do that), it runs on your own infrastructure.

For people who do not know what BlueIgnis is, you can refer to my earlier blog post, which gives a gentle introduction to its functionality.

Keep watching this space for more information and updates regarding BlueIgnis.

Tuesday, November 29, 2011

Online Virtual Reverse Auctioning

This is a small idea for an online event at a technical symposium on our college campus. Here go the details - feel free to post your comments on this idea.

Name of the Event - Virtual Reverse Auctioning

Event Description

  • Just like any auction website, this one will also hold auctions of various (virtual/random/nonsense) items. 
  • Generally, in any auction, the highest bidder gets the item at the price bid; in a reverse auction, the person with the lowest bid wins (when the time runs out). 
  • To start with, participants are given Rs. 100/- (negotiable) in their accounts. 
  • To keep participants from simply posting Rs. 1 for every item, I have come up with a rule for the bid amount (see the sketch after this list). 
    • The lowest permissible bid is 95% of the current lowest bid value.
      Example - if the initial cost of the item is Rs. 100 and user A bids Rs. 95/-, the current cost of the item becomes 95. User B can now bid as low as Rs. 90.25, with no upper limit on the high side. 
    • Each bid costs the user a buck (Re. 1/-). 
    • There is a small design choice to be made here: charge per bid, or charge per item bid on.
      Example - every bid I place on item I costs me Re. 1/-, or all the bids I place on item I together cost me Re. 1/-. I leave this for you to decide (post what you feel below as a comment). 
  • Upon winning the bid, the original cost of the item is credited to the winner's account. Thus the user gets more money to bid further in the system. 
  • Since a bid can only go as low as 95% of the current lowest, participants keep checking in often to see whether anyone has underbid them. 
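Here is a tiny sketch of how the bid validation could work. The function names are mine, purely for illustration, and the numbers are the ones from the example above:

```php
<?php
// Sketch of the 95% rule described above; names are illustrative only.

function lowestAllowedBid($currentLowest) {
    // A new bid may go no lower than 95% of the current lowest bid.
    return round($currentLowest * 0.95, 2);
}

function placeBid(&$balance, &$currentLowest, $bid) {
    $floor = lowestAllowedBid($currentLowest);
    if ($bid < $floor) {
        return "Rejected: bid cannot go below Rs. $floor";
    }
    $balance -= 1;                 // each bid costs the user Re. 1/-
    if ($bid < $currentLowest) {
        $currentLowest = $bid;     // the new lowest bid leads the auction
    }
    return "Accepted: lowest bid is now Rs. $currentLowest";
}

$balance = 100;   // starting cash of Rs. 100/-
$lowest  = 100;   // initial cost of the item

echo placeBid($balance, $lowest, 95), "\n";    // A bids Rs. 95    -> accepted
echo placeBid($balance, $lowest, 90), "\n";    // B bids Rs. 90    -> rejected (< 90.25)
echo placeBid($balance, $lowest, 90.25), "\n"; // B bids Rs. 90.25 -> accepted
```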
Winning Criteria - The participant with the highest cash in hand (plus the number of items traded on the market; this again is up to you - feel free to post what you feel below as a comment) wins the game. 

General Doubts 
  1. The organizer of the event is given admin access to add items to the system. He can set the name, type, initial cost, and auction time period of all the items currently enabled in the system. More on this for the developing team. 
  2. All the items will be completely virtual. It can range from a banana to iPhone - you are limited by your imagination. 
  3. The system is yet to be developed. I am hoping to put together a team (of max 4) to build it; it's fairly simple to do. I would like to see second- and third-years volunteer (first-years are also welcome) to make it a reality. 
  4. If you are interested in developing this idea, contact me, I will pour in more ideas.
Disclaimer: This game plan is entirely a creative idea from Ashwanth Kumar, shared with the sole intention that it might be useful. If you already have a similar or the same idea, I would love to discuss your implementation. 

Midnight Exam Boredom

I have my Embedded Systems exam today, so I woke up at 12.30 AM to start studying, when I was faced with an interesting problem. I got myself an eBook of Raj Kamal, "Embedded Systems - Architecture, Programming and Design", Tata McGraw Hill, Second Edition (I do not want to share the link here; a simple Google search will turn it up on many of the file sharing sites).

The pages were aligned for front-and-back printing - one page aligned left, the next right. It was a pain in the ass to read such a text, so I wrote a simple iText-based Java program to correct it. It runs in 2 passes: one to correct the even pages of the book to align left, and another to go through the pages and turn them from the default landscape view into portrait format.


PS: This code is for personal use only. Try this at your own risk.  

Tuesday, November 22, 2011

Mini Online Judge - 6 Hour Hackathon

I was too bored and sleepless to study for the semesters, so I made a small attempt at one of the long-time candidates on my TODO list - "Creating an Online Judge System". Well, guess what: I have come out with a prototype this time.

This is far from being complete, but definitely usable.

PS: This code is released in the hope that it might be useful. Try it at your own risk. Most of the instructions are in comments in the code.

Tuesday, November 15, 2011

Image processing meets Semantic Web

For the past 2 days, I have been working alongside a friend of mine on her final-year project topic. Her guide restricted the domain to "Image Processing" - she had something to do with it, I guess. Anyway, after browsing and downloading 40+ papers on image processing, I really liked this one. 

Yesterday we were preparing the abstract, and after a lot of local misunderstandings, the guide finally signed it. This morning, while I was preparing for my semester exam, this thought struck me, and before I forget, let me document it. So here it goes.

Image Processing + Semantic Web => My Kind of Vision

If you take the time to read the abstract at the above link, you will see how wonderfully they introduce a new data structure for image annotation. Their aim is to build an online collaborative image annotation tool, something like LabelMe. The main features: a modular design, and the ability to import other online (LabelMe, Flickr) and offline (Caltech101, Lotus Hill) datasets into the system. Apart from text-based human annotation, it can also embed low-level image details like color histograms (I'm relatively new to image processing, so let me skip the rest of the details rather than confuse you).

Well, this already exists, and she has decided to implement such a tool with added features like web services to search through the annotations, and more. The rest of the blah.. blah.. content goes here.

While browsing through the LabelMe dataset, I figured something out. It's like the usual temptation to study well only outside the exam hall. This is what I concluded before I ran out of time for my E-Commerce exam.

I have a collection of images annotated very well (at least decently well) to make them more human-friendly and more semantic. That does not make them machine-friendly, does it? (Or am I getting carried away?) 

Another MIT Media Lab project is ConceptNet5. It provides general-purpose common-sense knowledge to computers, in my most favourite language - JSON. It contains around 15+ million entries. 

So, this is what I'm going to build (hoping my guide would approve it). 

  1. Take the existing annotated image dataset and apply ConceptNet common-sense knowledge (on the annotations of the images) to make the images more semantic and machine-friendly. At this point the machine can learn from the annotated image - let us call it TEACH mode (a rough sketch of this step follows the list)
  2. Build an index of properties over all the available images to enable image or object recognition via a search interface - let us call it SEARCH mode
  3. Build a reasoner on top of the ConceptNet relations and the annotations processed so far. This helps draw conclusions from the image annotations - let us call it PROCESS mode
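To make the TEACH step concrete, here is a rough sketch of enriching one annotation with ConceptNet relations. The endpoint URL is only my guess at the JSON lookup API (check the ConceptNet5 docs for the real paths), and the edge keys are what I expect its JSON to carry:

```php
<?php
// Rough sketch of the TEACH step: enrich one image annotation with
// ConceptNet relations. The endpoint URL is only an assumption of the
// JSON lookup API - check the ConceptNet5 docs for the real paths.

$annotation = 'banana';
$url = 'http://conceptnet5.media.mit.edu/data/5.1/c/en/' . urlencode($annotation);

$json = file_get_contents($url);   // hypothetical JSON lookup call
$data = json_decode($json, true);

if (!isset($data['edges'])) {
    exit("No edges found for '$annotation'\n");
}

// Each edge is a (relation, start, end) triple; e.g. "/r/IsA:
// /c/en/banana -> /c/en/fruit" becomes a machine-usable fact that can
// be attached to the image annotation.
foreach ($data['edges'] as $edge) {
    echo $edge['rel'], ': ', $edge['start'], ' -> ', $edge['end'], "\n";
}
```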
Possible Applications 
  1. Combining the power of the relations and the reasoner, I would be able to fetch dynamic content (from Google and Wikipedia) about a particular annotated object within an image when queried. 
  2. With the image index thus created, I can recognize objects and thereby automate the annotation process itself, providing the previous application in a more automated way. 
Things I have to learn and random notes
  1. The basic image properties that I would be indexing with the images
  2. Maybe use SVG or another custom data structure to store the index
  3. Bridge the relations and concepts to predicate-logic reasoning
  4. Adapt a machine learning algorithm (something like naive Bayes, or even more sophisticated) to make the image learning possible

If any of you find more features worth adding to the system, please feel free to post them as a comment. Probably this idea isn't new at all - I didn't take the time to Google it. Let me know if it's already out there; perhaps we can build something even more useful on top of it. 

Wednesday, October 26, 2011

MySPARQL - Multimedia Database for Semantic Web

The following is just the specification for one of my hacks, written a long while back, which I thought I could share with the world. If anyone is interested in this, or already knows of something like it, please feel free to comment - I would love to look at it.

PS: The document is very informal, basically a note of what came to my mind. Any help improving it is also much appreciated. Thanks!




Tuesday, October 25, 2011

From SQL to NoSQL - SubhDB

It is so nice to be back home after months, especially since I can get my hands on my computer - God, it feels so great. I've been playing around for the past 2 days with document-oriented datastores like MongoDB, RavenDB, CouchDB, etc. I liked them all. Most of them required me to install them as an application on my server to do anything really useful, and since I only own shared hosting, not a VPS or a cloud box, that was impossible for me.

I wanted to play around with the idea on the existing LAMP stack I was given. So, I created a document-oriented datastore for myself.

Please say hello to SubhDB: an abstraction of a document-oriented datastore over traditional MySQL, implemented in PHP.

Inspired by the diagram on the home page of RavenDB (http://www.ravendb.net/Uploads/WindowsLiveWriter/RavenDB_C707/image_thumb_1.png), I designed the data model for the datastore. It can't store array attributes yet, though.
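For the curious, here is roughly how I picture such an abstraction working - a sketch of the idea only, not SubhDB's actual schema or API. Each document becomes a row, and each of its (scalar) attributes becomes a (name, value) row of its own:

```php
<?php
// Sketch of a document-oriented abstraction over MySQL (the idea only,
// not SubhDB's actual schema or API). Assumes two tables:
//   documents(id INT AUTO_INCREMENT PRIMARY KEY, collection VARCHAR(64))
//   attributes(doc_id INT, name VARCHAR(64), value TEXT)

$pdo = new PDO('mysql:host=localhost;dbname=subhdb', 'user', 'pass');

// Store a document: one row in `documents`, one row per attribute.
function saveDocument(PDO $pdo, $collection, array $doc) {
    $pdo->prepare('INSERT INTO documents (collection) VALUES (?)')
        ->execute(array($collection));
    $id   = $pdo->lastInsertId();
    $stmt = $pdo->prepare('INSERT INTO attributes (doc_id, name, value) VALUES (?, ?, ?)');
    foreach ($doc as $name => $value) {
        $stmt->execute(array($id, $name, $value)); // scalar attributes only
    }
    return $id;
}

// Load a document back into an associative array.
function loadDocument(PDO $pdo, $id) {
    $stmt = $pdo->prepare('SELECT name, value FROM attributes WHERE doc_id = ?');
    $stmt->execute(array($id));
    $doc = array();
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $doc[$row['name']] = $row['value'];
    }
    return $doc;
}

$id = saveDocument($pdo, 'people', array('name' => 'Ashwanth', 'city' => 'Thanjavur'));
print_r(loadDocument($pdo, $id));
```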

You can find the source code and instructions to give it a test drive on GitHub. Please post what you feel about it. I searched for existing implementations and found none.

PS:
  1. A one-day hack, which let me spend some useful time at home during the holidays. 
  2. Well this is not even close to being complete or stable. 
  3. No comments regarding the name of the project please. 

Saturday, October 15, 2011

Got Placed!

The title says it all! I got placed, and yeah, it's been around 2 months, and many interesting things have happened in the meantime. I'm not sure if I'll cover them all, but I will try.


I got placed at Mu Sigma (http://www.mu-sigma.com/) as a Business Analyst. Yeah, yeah, I know; it's not an IT company, it's a business-oriented company. You see, that actually gives me all the more reason to join. These people come up with solutions to business problems (now, do not ask me yet what "business problems" means) for their customers. It goes without saying that the amount of data they process surely requires some deep data mining techniques.


Guess what more? MMBUZZ (http://www.mu-sigma.com/dothemath/id_mtex.html) - this particular project got me the job in the first place. My Social Heat (aka BlueIgnis) project does the same work! The Vice President of the company was impressed, and yeah, he offered me a job. I got a laptop bag along with my offer letter - all the more reason to join. :-P

I've also been on the hunt for some internship programs lately, got a couple of offers. Yet to decide which one to take.

Final semester, with hardly a week of working days left in my college life. What have I learnt? What does it mean?

(After almost 1 hour of thinking)

Aahh.. I'll put that up later, not right here.

The happy thing is, I got placed. Thanks to SASTRA and my mentor. 

Monday, August 8, 2011

SASTRA PWI SMS

After a practical suggestion from Shankar Ganesh, this time I'm providing an SMS method for students to access PWI content.

Ingredients - PWI, txtWeb and everything nice

Keyword to use - @pwi
Number to send SMS - +91 92433 42000

Usage - Send a message to @pwi to know the complete list of commands and their usage.

Parts like attendance, hostel details, and internal marks are not yet available; I'm working on them and will update soon. Just give it a try and tell me what you think of it so far.


Sunday, August 7, 2011

SASTRA PWI IVRS

You read it right: it's an IVR system for SASTRA students. Taking the experience from Yahoo! OpenHack 2011, and equipped with PWI (from Vignesh), KooKoo, and everything nice - an idea that struck me over a week ago. This weekend I finally found time (in spite of a whole-day power cut yesterday) to implement the system.

Give it a try - call this number, +91(040/080/022)39411020, enter the access PIN 3823, and follow the instructions you hear. Do remember to choose option 7 for feedback.

Currently the system supports only SASTRA SRC students' PWI accounts. Once I get hold of some Main Campus credentials, I'll implement the corresponding Main Campus features like internal marks, attendance, hostel details, etc.

PS: Sorry, the Tamil Nadu number will be updated soon.

Sunday, July 31, 2011

When Pain and Loneliness meets Passion and Creativity

After around 2 solid months, here is my blog post. Lots and lots of things to share, but right now I'm at Yahoo! OpenHack Day 2011 @ Bangalore. It's been really cool (literally and figuratively) here for the last 2 days. Got to see the person who brought JSON to this world, most commonly termed the father of JS - Lord Douglas Crockford.

He gave 3 awesome presentations through yesterday evening, on topics like:
  1. Server Sidedness - mostly about Node.js and designing system architecture using server-side JS.
  2. The JSON Saga - the history of JSON and how he brought it to the world; rather, discovered it from nature, as he put it.
  3. Programming Style and Your Brain - here he talks about programming practices and styles: how to choose a particular programming language for doing something, and how not to. Really super cool stuff; my personal favourite of the three.
Apart from him, there were many Y! engineers and H/W and S/W enthusiasts, really rocking dance performances, and super cool lunches and dinners. An awesome weekend, if you ask me.

Coming to what I did during the 24 hours of hacking:

You're already familiar with applications like TeamViewer, Remote Desktop Connection, etc., which let you access your remote desktop securely using some of the most secure and complicated protocols. I do the same, except with a small difference: you'll be controlling your device (not just a computer) remotely via a phone call. Using a special API, one can build any application or service that can access the commands on Remote Mobile and execute them on your device.

You can take it for a test drive now for free (it's in AK Labs now). All you need to do is follow this simple procedure.
  1. Go to the Remote Mobile website and register an account for yourself.
  2. Download the Windows client (sorry, currently only a Windows client exists - that was all I could make in < 20 hours) to your computer and install it. No virus - I swear!
    You'll automatically be redirected to the download page upon successful registration.
  3. Log in to the app using the email and password you used during registration.
  4. Upon successful login, you can see an Access PIN on the right side of the app, below your email ID and the change-password option.
  5. Now it's time to add some commands to your profile.
  6. Click on Add Command and add any number of applications you want to your profile. I've added some as samples (see the screenshot below).

  7. Once you're done, call this number - (040/080/022) - 39411020.
  8. Enter the access code 6623.
  9. Now enter the Access PIN visible on the screen.
  10. The system will read out the commands for you, like any IVRS. Enter the option you want to execute, and watch your monitor before the call ends - you can see the program being executed on your system.
Well, that's it! Play around with the application and the platform. This is an individual hack, and many known errors still exist. A 90-second prototype is to be presented in about an hour here in this hall, so any feedback on the UX, the UI, or the system itself is highly appreciated.

Tuesday, May 10, 2011

Fo2Hub - Launching Soon

Yeah, you got it right: FotoHub has become Fo2Hub. Reason? Simple - I was not able to get fotohub.com as the domain name, so I got fo2hub.com. A million thanks to my bro, who got me the new domain and hosting. I'm now under no restrictions, and have no problems. Thanks a lot, bro!

Fo2Hub - Photo Sharing for Social Nerds - will be launching soon. I've currently put up a small landing page on the domain. Since my main domain ashwanthkumar.in is also being moved, it might be unavailable for 24-72 hours at most. You can still access a read-only archived copy here; that location will not be updated anymore.

Fo2Hub.com - Launching Soon!

As always, any feedback or suggestions are always welcome.

Sunday, May 8, 2011

FotoHub - Photo Sharing for Social Nerds

Update 10/05/2011 - FotoHub is now Fo2Hub. You can find more details of it here.

I was just updating my profile on my website when I realized it would be nice to add a personal gallery. Flickr was the obvious choice. I logged into my Flickr account only to realize that I hadn't logged in since 11th grade. He.. he.. So, what do I know? Picasa is already available on my Google profile, but I have many of my photos on FB too. And what about the ones I am tagged in? Oh my God! Maintaining a profile is really a pain in the ass. Grrr..!! That does it.

I'm going to start my own photo sharing web application - one that can synchronize from Facebook, Picasa, Flickr (and many more); an application in which I can choose to synchronize specific albums from my social presence online. It basically targets people with a very large social presence (not exactly like me, but... you get it, right?).

Welcome a new baby to AshwanthKumar.in Labs: FotoHub - a photo sharing space for social nerds.

"Instead of maintaining a separate accounts for the purpose, why not create an album syndicating pictures from multiple sources (probably with your tags, which will be added shortly after release) and accessible via an easy to use API?", the thought that got me designing FotoHub for public.

Though Social Heat is going well and is due for (pre-beta) public testing very soon, FotoHub seems to be a good stress reliever for me. Watch this space for more updates on FotoHub.

PS: I'm still keeping my eyes and ears open for anyone who can suggest a web application (not an Android app or a desktop picture viewer) that can do this job for me.

Wednesday, May 4, 2011

iBlue Semantic Extractor - Now Speaks 52 Languages

Another good morning to you people. Today I have an interesting update regarding the iBlue Semantic Extractor. The Semantic Extractor component was initially built as a project prototype for Mr. Guruprasad S. Srivatsav late last year. It simply extracted entities out of text and found their relations and categories. When it was first created, it had a 1000-word limit and handled English only.

Today, I present to you the iBlue Semantic Extractor with support for 52 languages. You can enter text in any of the 52 languages; it is translated to English on the fly, and the entities, relations, and categories are then identified.

The 52 supported languages are: Afrikaans, Albanian, Arabic, Belarusian, Bulgarian, Catalan, Chinese Simplified, Chinese Traditional, Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, Galician, German, Greek, Haitian Creole, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Latvian, Lithuanian, Macedonian, Malay, Maltese, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Thai, Turkish, Ukrainian, Vietnamese, Welsh, Yiddish.

You can visit the iBlue Semantic Extractor component for a free preview here - http://ashwanthkumar.in/labs/iblue/

Next milestone - identify the entities natively in the given language.

Tuesday, May 3, 2011

Social Heat - My Social Analytics System


Here is the gentle introduction of Social Heat - Social Analytics Engine.

If you have used the Google Analytics service, you might have an idea of what exactly I'm trying to say here. Unlike Google Analytics, which gets you detailed analytics for your website visits, Social Heat gives you a detailed analytics report on your product across social networks (like Facebook, Google Buzz, Twitter, etc.).

Inspired by the following video for Viral Heat, I decided to spend this summer vacation creating a system on par with, or more sophisticated than, Viral Heat.


So I decided: why not create one this summer? Currently the Viral Heat service is not free, but it's really cheap, starting at only $9.99 per month, and it comes with unlimited impressions for your profiles.

Social Heat, following the philosophy of Google, will allow free usage that should suffice for most people around the world. It will also have paid plans, to be disclosed later. I plan to provide more services than Viral Heat, rolled out eventually after the public release.

Features of Social Heat:
  1. Featured reports on users worldwide from Facebook, Twitter, Google Buzz, YouTube, and blogs (blogging is also a part of social media).
  2. Export your reports in one of many formats (CSV, Excel, PDF, XML, etc.).
  3. Create profiles and add Mentions to them. Each Mention comes with unlimited impressions. This feature was inspired by the Google Analytics and Viral Heat dashboards.
  4. Social Heat has inbuilt machine learning capabilities. If you ever think an impression is not relevant to your needs, you can teach the system not to display that kind of feed again. Similarly, the reverse: if the system has not found an impression to be related to your product, you can add it to your report. The system is completely dynamic.
  5. Social Heat comes with an extensible API. You can access your data via the robust API platform it provides, and other developers can create applications that consume your data and give you much more detailed reports - inspired by the Google Analytics API and the applications built on top of it.
Now, deploying a system for general public usage is still far off; moreover, I have neither the infrastructure nor the capital to create one. So I'm on the hunt for an angel investor for this adventure. If you are interested and would like to come along, feel free to contact me at ashwanthkumar@googlemail.com. If you think Social Heat is missing some features, feel free to suggest them; we'll try to incorporate them.

Watch this space - Social Heat is fast approaching closed public testing.

Project UIM - Going Good

Update May 3rd, 2011 - Now UIM consumes Blogger feeds too. :)

For over a week - or is it two? I don't seem to remember - I have been working on something I would like to call Project UIM (Unstructured Information Management). It's basically iBlue-mini, something like that: trying to maintain one of the largest repositories of structured information for computer systems.

I took the following sites as my sources of content: Gizmodo, Lifehacker, Mashable, and TechCrunch. For almost a week, every article they posted came into my system for processing within an hour. Using the Semantic Extractor component (available here for preview), I was able to extract some useful, re-usable information (yeah! re-usable information) out of them.
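The ingestion side is conceptually simple - poll the feeds and hand each article to the extractor. A sketch, assuming standard RSS feeds (the feed URLs are examples, and extractEntities() is a hypothetical placeholder for the Semantic Extractor call):

```php
<?php
// Sketch of the ingestion side, assuming standard RSS feeds. The feed
// URLs are examples (the real ones may differ), and extractEntities()
// is a hypothetical placeholder for the Semantic Extractor call.

$feeds = array(
    'http://gizmodo.com/index.xml',
    'http://feeds.mashable.com/Mashable',
);

foreach ($feeds as $feedUrl) {
    $rss = simplexml_load_file($feedUrl);
    if ($rss === false) {
        continue; // skip feeds that are unreachable right now
    }

    foreach ($rss->channel->item as $item) {
        $title = (string) $item->title;
        $body  = strip_tags((string) $item->description);

        // Hypothetical hand-off to the Semantic Extractor component,
        // which returns entities (with types), categories and relations:
        // $entities = extractEntities($title . "\n" . $body);
        // ...store them for the kind of stats shown below.
    }
}
```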

I just ran some diagnostics on it today and would like to share some stats. My system has consumed 1061 articles (at the time of writing) from the above sources, and identified over 3368 entities (across 34 types), 18 categories, and 46 relations.

Below are the stats of entity counts by type. :-)

Count   Type
124     City
522     Company
5       Continent
64      Country
6       Currency
4       EmailAddress
7       EntertainmentAwardEvent
115     Facility
11      Holiday
852     IndustryTerm
2       MarketIndex
10      MedicalCondition
24      Movie
19      MusicAlbum
29      MusicGroup
18      NaturalFeature
15      OperatingSystem
154     Organization
617     Person
13      PhoneNumber
1       PoliticalEvent
380     Position
80      Product
11      ProgrammingLanguage
39      ProvinceOrState
49      PublishedMedium
3       RadioStation
8       Region
10      SportsEvent
5       SportsLeague
114     Technology
16      TVShow
1       TVStation
28      URL

That, in my opinion, is very decent for a pre-alpha system.

I'm planning to expand the data sources to see how well Project UIM can tame the beast that is the Internet. Any other suggestions on the same are welcome; please leave them as a comment.

Sunday, May 1, 2011

Measuring Blogger

I was fast asleep when I suddenly woke up from a dream - a dream all about building the world's biggest structured infrastructure for information retrieval. When I wanted to blog about it, I got thinking: how many blogs does Blogger (my humble blog service provider) host?

I went to the Google APIs page to see if they provide an API, or any kind of means, to get the list. I was not able to find anything - not even a clue :(

So, I decided to measure Mr. B myself.

Here is the live stats of the counting process - Counting Blogger Blogs.

What next? Counting WordPress blogs, Joomla installations, Drupal installations, Elgg installations? What is happening to me, oh my God!!

PS: Only counting for now.

Friday, April 29, 2011

April 2011 - A Replay!

I just couldn't ask for more. What a month it has been! I'm surprised at the number of things I was able to do in just a month. All credit goes to you people, readers of my blog: for everything I came up with, you were always there to give me feedback and keep me motivated. April 2011 isn't over yet - and neither am I.

This month saw the end of many beautiful things of my life.
  1. GSoC 2011
  2. VI Semester college life
  3. End semester practical examinations.
I don't know how to put it, but I'm sort of happy still. I did a lot of things this month -
  1. Got myself introduced to Elgg and its core.
  2. Created a SPAM detection plugin for Elgg 1.8
  3. Social Semantic Viewer started off as this month's first Facebook app
  4. iBlue - the Semantic Extractor engine was re-written from the ground up to improve its performance.
  5. iBlue - Sentiment Analyzer, a simple implementation to test Bayesian filters using the Multi-Domain Sentiment Dataset
  6. Social Heat - This baby was just born today early morning in my Labs.
Social Heat is my new application, in development since yesterday evening. What it is, how it works, and other details will be out soon. I just want to give one thing away now: if you belong to the kind of people who think social networking is all crap and Facebook is $hit, this little piece of code will surely revolutionize your business.


Tuesday, April 26, 2011

Sentiment Analysis - iBlue Component Preview

iBlue is fast evolving from a dream into reality. After working on the Semantic Web in depth for more than a week, I have learnt many principles, theorems (I didn't even bother to study theorems for my Maths papers), standards, and much more.

You might have had a look at my spam detection plugin for Elgg, here. It was while working on it that I came to know about Bayesian filters and their usage in spam detection, text classification, etc. It is one of the fundamental machine learning techniques. Blah.. blah.. You can find more details on Wikipedia - http://en.wikipedia.org/wiki/Recursive_Bayesian_estimation

That's it for the introduction. Now I welcome you all to test drive my very first Bayesian filter implementation - Sentiment Analysis. It works much like the Semantic Extractor, except that it gives only one piece of information about the text.

Is the given text a positive feedback or negative?

It returns the final percentage for both cases. So just go ahead and give it a try.

Technical specs for nerds - the filter was trained using data from Mark Dredze's Multi-Domain Sentiment Dataset. I took around 25000 Amazon reviews from the dataset, across multiple product categories, to train the filter.
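For the curious, the family of filter at work looks roughly like this toy sketch (my actual implementation differs, and the real training set is obviously much larger than the two reviews shown here):

```php
<?php
// A toy Bayesian text classifier in the same family as the filter above
// (the actual implementation differs; the training data here is only a
// two-review example).

function tokenize($text) {
    return preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Count word frequencies per class ("positive" / "negative").
function train(array $labelled) {
    $model = array();
    foreach ($labelled as $item) {
        list($text, $class) = $item;
        foreach (tokenize($text) as $word) {
            if (!isset($model[$class][$word])) {
                $model[$class][$word] = 0;
            }
            $model[$class][$word]++;
        }
    }
    return $model;
}

// Score a text against each class using log-probabilities; Laplace
// smoothing (+1) keeps unseen words from zeroing a class out.
function classify(array $model, $text) {
    $scores = array();
    foreach ($model as $class => $words) {
        $total = array_sum($words);
        $vocab = count($words);
        $score = 0.0;
        foreach (tokenize($text) as $word) {
            $count  = isset($words[$word]) ? $words[$word] : 0;
            $score += log(($count + 1) / ($total + $vocab));
        }
        $scores[$class] = $score; // higher (less negative) wins
    }
    return $scores;
}

$model = train(array(
    array('great product works perfectly love it', 'positive'),
    array('terrible waste of money broke in a day', 'negative'),
));
print_r(classify($model, 'love it, works great'));
```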

Demo: http://ashwanthkumar.in/labs/sentiment/sentiment.php - it uses the content of Mashable's review of the Samsung Galaxy S Android smartphone (link: http://mashable.com/2010/07/26/galaxy-s-review/).

Any feedback is highly appreciated.

Saturday, April 23, 2011

Semantic Week

If my entire last week was spent with Elgg, this entire week was spent on semantics - semantics in its various aspects. You should have already seen my Semantic Social Viewer (http://ashwanthkumar.in/labs/facebook/); I also did some really super stuff with Guruprasad S. Srivatsav for his final-year project.

I've been looking around the WWW for social computing stuff and found a lot of new ideas to work on this summer.

Most important of all, my GSoC results are coming out on the 25th of this month. As I've already said, I have no hope for that at all. I got introduced to a guy whom I checked out, only to find that he has applied to the same organization, on the same topic as mine, with a 150% more sophisticated profile. So the dream of $5000 to fix my PC is all lost! Still, #elgg is always my best friend, and I'll continue to contribute to Elgg. I'm waiting for the results to release my plugin.

So, nothing more interesting yet! Got my semester practicals coming up. Yeah, it's the same procedure of Ctrl + C & Ctrl + V and some 30 minutes of vetti time waste, and then pracs are all over. Theory is all ready to confront me, while I ain't ready for it.

One more to go after this - countdown started! =D

Thursday, April 21, 2011

Social Semantic Viewer - Facebook App

Update 26/04/2011 - Added an Export Image option to the app. Now you can export your network as a PNG and share it on FB.

Ladies and gentlemen, please allow me to introduce Social Semantic Viewer: your next-generation (kind of) IDE for your Open Graph at Facebook. Okay.. okay.. I get it!

It's YAVP (yet another vetti app) from me to kill time. It uses some of the most widely used open-source components to help visualize your Open Graph.


Log in with your Facebook account and click on the type of visualization you want to see. Currently the system allows visualizing your friends and likes, but more will be added soon.

TODO:
  • Add a control box that helps you steer the graph and decide what you want to visualize.
  • Add more common user actions. Currently the system supports posting to any friend's wall.
  • Extend the graph to the Social Graph (Google), the Twitter graph, and LinkedIn connections, and extend the same functionality to Elgg (Elgg is in progress).
Privacy policy (no blah blah.. please read on): we're not storing your graph or any personal details that could identify you. 95% of the processing is done at the client end. We crawl your graph on demand and display it to you - it's a slow process, actually.

All you've to do is authorize the application, click the Start Crawl link at the bottom of the screen, and you're good to go.




There you can see my graph being generated, with 300+ friends on the screen. As you explore more, your graph might become a bit slow. We've tried our best to use Flash hardware acceleration, and yet it might still be a bit slow.

Known issues: you might get Error #1099 or "Error undefined" at times, after which please reload the page; it happens when you have a slow internet connection or when the FB Graph API server doesn't respond.

Any sort of feedback for the same is welcome.

Sunday, April 17, 2011

Elggified Week - 100%

Hey people, it was last week that I started working on the SPAM Moderation plugin for Elgg 1.8. Since then, a lot of improvements and exciting things have been happening in my world.

Man, I just love it.

Officially, I'm a 100% Elggified developer. You got me, Elgg!

In my SPAM Moderation plugin, I've integrated 2 external services (Akismet and Mollom) and 1 simple internal service that validates any text against a list of keywords you've specified. I know, I know, it sounds a bit silly and simple, but that's a start, right?

Okay, I have my final internals this week, so no intense work for the week(s) to come, I guess. I desperately need to catch up with my so-called syllabus.

Interesting things this week:
  1. Wrote my first Elgg plugin, released it on GitHub, and for the first time in my history released OSS (licensed under an OSS license).
  2. Got to deploy it on a new server (for testing purposes, actually =D)
  3. Learnt about some SPAM filtering techniques, to implement off the shelf. (Next Elgg plugin)
  4. Completed my long-queued RECORD work for this semester. Hell yeah!
  5. Learnt that there were 54 applications from 46 students, plus 12 known bug fixers, but only 3 slots for Elgg in this year's GSoC. (I actually completely lost hope!) But that doesn't make me hate it; rather, I <3 Elgg even more now. :D
I haven't released the spam moderation (alpha) plugin to the community yet - should I? I'm a bit afraid and tense about doing it, and there hasn't been much outside response about it either.

Will this alone earn me a slot in GSoC for Elgg?

Hoping for the best! See ya, until next time, with more elggified updates.

Sunday, April 10, 2011

SPAM Moderation Elgg Plugin

Update 14/04/2011 - SPAM Moderation Plugin has been updated to support keyword based filtering, and Mollom SPAM filtering service is also integrated. Check out the github repo.

Today I started my journey with Elgg. I wanted to get used to the functioning of the system, so I was reading the docs (http://docs.elgg.org/) when Mr. Akshat asked me for a plugin that could do spam detection on content posted by students. So I thought it was time I dove in for some action with Elgg, and here is the outcome of 3 hours of coding - a new spam detection plugin for Elgg using the Akismet web service.

To add a feather to the cap, I feel proud to say this is a proposed project idea for Elgg in GSoC 2011. I do not know if anyone is working on it; if so, please consider this your groundwork.

I'll post it on my GitHub and will also submit it to the Elgg community for feedback. Note that this plugin is in the pre-alpha stage: only blog posts are checked for spam. I'll add checks for more content types in the coming days.
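For the curious, the heart of it is an Akismet comment-check call, whose general shape looks like the sketch below (the plugin's internals differ; YOUR_KEY and the blog URL are placeholders):

```php
<?php
// General shape of an Akismet comment-check call (the plugin's
// internals differ; YOUR_KEY and the blog URL below are placeholders).

function isSpam($apiKey, $blogUrl, $content, $authorIp) {
    $fields = http_build_query(array(
        'blog'            => $blogUrl,
        'user_ip'         => $authorIp,
        'user_agent'      => 'Mozilla/5.0',
        'comment_type'    => 'comment',
        'comment_content' => $content,
    ));

    $context = stream_context_create(array('http' => array(
        'method'  => 'POST',
        'header'  => 'Content-Type: application/x-www-form-urlencoded',
        'content' => $fields,
    )));

    // Akismet answers with a plain "true" (spam) or "false" (ham).
    $response = file_get_contents(
        "http://$apiKey.rest.akismet.com/1.1/comment-check", false, $context);

    return trim($response) === 'true';
}

if (isSpam('YOUR_KEY', 'http://example.org', 'Buy cheap pills now!!!', '203.0.113.7')) {
    echo "Flagged as spam\n";
}
```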

GitHub Repo - https://github.com/ashwanthkumar/spam_moderation

Friday, April 8, 2011

Final Proposal for GSoC

You can find my GSoC 2011 proposal here at (GSoC Official website) http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/achu/1.

You can download a copy of PDF from here (http://goo.gl/qzjTD)




Any comments are welcome. Please use the comments to let me know.

PS: This is just an update post.

- Ashwanth Kumar

Tuesday, April 5, 2011

Semantic Web dream with Elgg

In continuation of my previous post here, I've decided to come up with a new idea - something that Elgg and its community will want on the web, sooner or later. Allow me to introduce "Elgg Graph": a graph that describes users, their posts, comments, activity - everything on any Elgg community anywhere around the world.

Basically, this is what I have in store for Elgg (during and post-GSoC). Elgg Graph is a local, site-specific entity that describes all the objects and their relationships. Probably we'll try to add a feature for the admin to visualize it in the admin panel (not now), like LinkedIn InMaps, etc.

Once the graph is perfected and things are set up, we need to build a solid, strong API on top of it. The graph does not allow direct editing; we need to provide a web service on top of the existing structure to modify content, or it can always be done online by the users through the web interface.

Only once the Elgg Graph and its API are done would we be able to provide a proper social application development platform on Elgg.

Some time later, we can try to set up a global repo for communities to share their graphs and establish inter-graph communication between them (this is when we provide true semantics to the end user). This repo can consume the graphs in a Semantic Web format like RDF. Hence, with the permission of the site administrators, we can contribute the content to Linked Data, where it can be consumed by other apps and websites on the web to increase their usefulness to the user.

Of all this, for this GSoC I'm planning to implement the Elgg Graph and its API, and to provide web services for the basic graph operations. There was also a project idea on the Elgg ideas page for RDF-type stuff; maybe someone is already working on it. I'll try to implement the rest after GSoC 2011.

The idea, or dream, I've tried to explain here is very high-level. Once I start working on its various aspects, I'll decompose them and keep you posted :)

Any comments or feedback on this would be highly appreciated. As I mentioned earlier, I'm new to Elgg and am discovering new and interesting aspects of it almost every day. If I'm missing anything here, please correct me.

Update: I'm also going to work on RDF content export for Elgg, with Jena TDB and Joseki as the SPARQL endpoint.

- Ashwanth Kumar

Elgg - OpenGraph Protocol

Yet another proposal idea for GSoC 2011: implementing the Open Graph protocol for site admins. You can enable the OpenGraph plugin, which adds Open Graph meta information to the site that can later be consumed by any OG consumer. Any insights on this?

To all the offline comments: yes, I'm desperate to do something BIG for Elgg.

PS: This is just a stub containing an idea, nothing more.

- Ashwanth Kumar

Saturday, April 2, 2011

Elgg - A Gentle Introduction (with GSoC 2011)

Well, I started the hunt for GSoC in 2007, and at last found myself comfortable participating this year. I have hardly 3 days left to submit the proposal, which I've not started working on. Before I continue, I would like to introduce my new friend: Elgg, an open-source social networking platform.

I was introduced to Elgg last month by Google (Search), and wow - I found it on GSoC, started digging into it and its source code, and made a couple of patches (didn't submit them, though). This post is for getting comments on the proposal I'm going to make.

Elgg has a large and vibrant online community, with many people contributing to it, developing it, etc. Things are going well - no problem at all. So I thought: if Eucalyptus (the cloud OS, as in the Ubuntu Server editions) is inspired by Amazon's non-standard features for S3, etc., why not get inspired by Facebook for Elgg?

So, my proposal is going to be on making a solid platform for users to develop apps for Elgg powered community websites around the globe.

Imagine: a single app that can be used on any number of websites around the globe and target more users (than Facebook... maybe?!).

It can be extended further (I would like to work on this too): if multiple Elgg installs (on the internet) can communicate with each other, then a global presence for a user can be achieved. Maybe Facebook will want to buy Elgg some time in the future, or maybe start an online social network war?

I just love the Elgg code base so much that I've decided to work towards this at any cost. Whether I get selected for GSoC or not, I'm working on it, and the work has already begun. If I don't make it into GSoC, you'll see my Elgg (sense of freedom) evolving into iBlue Social soon.

Actual details, work tasks, designs, etc. coming soon.

Update: I've asked the Elgg dev community for their comments. If you happen to read that post, please post your comments here too; I'll be following both. ;)

- Ashwanth Kumar

Saturday, March 5, 2011

Windows Hibernate - Interesting feature

When I restored my gear with a new Intel DG41CN motherboard and Transcend 2GB DDR2 RAM, I found an interesting problem on my Windows 7 boot screen. The CMOS battery was screwed up in the brand-new board, and upon system boot it gave me a CMOS battery failure event log screen.

On my last usage, 6 months ago, I had hibernated the system. Now the time was reset to the factory default, 01/01/2009 00:00:00. When I tried to restore my system from hibernation, it gave me a checksum error, discarded the hibernation data, and started from scratch again.

My conclusion: Windows 7 (or maybe earlier versions too) checks the hibernation file against the system time and its last-modification time, and only then resumes. Has anyone else faced this problem (rather, feature) of Windows 7?

Restored my Gear

Ha, some very good and pleasant news after a long time. It feels heavenly at home with my PC back up and running after almost 6 months (last working date - 19/09/2010) :)

Just wanted to share the word with all you people. I'll be resuming my iBlue and Blue Spider work very shortly. Watch this space for more info very soon.

- Ashwanth Kumar

Thursday, February 17, 2011

RE Semantic Analyser - Pre Alpha

Hello folks,

After a long time, something actually technical is coming up here. After hearing about GP's (Guruprasad, my senior) project (again) from Salai, he and I decided to put up a small working prototype. Working towards it, I came across an interesting problem suitable even for my RE iBlue - semantic analysis.

For my KBEngine implementation, I downloaded structured content from Wikipedia as ready-made RDF and used a datastore like Jena TDB to store, index, and retrieve the contents on demand. Now it's time I started structuring my own data from arbitrary data sources.

Link: http://ashwanthkumar.in/labs/iblue/

The above link shows you a pre-alpha working prototype attempting to do so. It uses various available web services for structuring the data, mainly Apache UIMA for processing. The hosted version processes external content, while the actual RE implements the same using Apache UIMA on its own infrastructure.

Link: http://ashwanthkumar.in/labs/iblue/iblue.php

The above link shows you a sample of the analysis format, using this blog post's content as its source.

Have a look at it, and please do comment with your feedback.

PS: Well, I'm thinking of submitting this concept, with my RE, for Daksh 5.0 Technovation.