In my previous post I provided some detail on the data cleansing and manipulation steps that precede the SNA. (I’ll provide the detailed steps in an annex to the report.) In this post I will provide an early temporal analysis. At this stage I still have not completed the detailed SNA, but I find these steps useful in understanding the dataset. Of necessity, they require me to look deeply at the data.
As I said in my previous post, it was necessary to remove "Anonymous" from the dataset, because "Anonymous" is almost certainly not a single person, and leaving it in would distort the results. For comparison, the graph below shows the number of posts with "Anonymous" still in the dataset. The blue line shows the trend.
The next graph shows the same data but with "Anonymous", identified pseudonyms, aliases, and duplicate names removed. Note the peak activity in 2008. Pete Cranston thinks it might be related to the Web2 for Dev conference in 2007, which coincided with an increase in membership from some UN agencies. Lucie Lamoureux is less sure and has provided me with some additional Google Analytics data to review. Can anyone in the community provide clarification or insights? Notwithstanding this question, it would appear that posting activity has peaked. I’ll try to provide a reason after further analysis. One possible explanation involves a figure known as Dunbar’s number; more on this in a later post.
The next two graphs respectively show posts by month and posts by day. Again, in both cases "Anonymous", identified pseudonyms, aliases, and duplicate names have been removed. It would appear the discussion group is most active in February and October, with most posts occurring on a Wednesday. The differences are statistically significant.
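For readers who want to try a similar check on their own data, here is a minimal sketch of one way to test whether posts are spread evenly across the days of the week. It is not necessarily the test I used: it assumes a chi-square goodness-of-fit test against an even split, and the function names, the sample dates and the weekday counts are all invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Critical value of the chi-square distribution for 6 degrees of freedom
# (7 weekdays minus 1) at the 5% significance level.
CRITICAL_DF6_P05 = 12.592


def weekday_counts(stamps):
    """Count posts per weekday (Monday=0 ... Sunday=6) from ISO date strings."""
    counts = Counter(datetime.fromisoformat(s).weekday() for s in stamps)
    return [counts.get(day, 0) for day in range(7)]


def chi_square_uniform(observed):
    """Chi-square statistic against the hypothesis that every category is equally likely."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical post dates, only to show how the counting works.
sample_dates = ["2008-02-06", "2008-02-13", "2008-02-09"]
sample_counts = weekday_counts(sample_dates)

# Hypothetical Monday-to-Sunday totals (NOT the real KM4Dev figures).
observed = [120, 135, 180, 140, 125, 40, 30]
stat = chi_square_uniform(observed)
significant = stat > CRITICAL_DF6_P05

print(sample_counts)
print(round(stat, 2), significant)
```

The same idea applies to posts by month, with one caveat: months differ in length, so the expected count for each month should be weighted by its number of days rather than split evenly.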
Deeper analysis shows most posts occur between 10:00 am and 2:00 pm Greenwich Mean Time (GMT). I don’t have the context to explain why this is the case, so I’ll leave it to the group to ponder. What I do know is that if I wanted an answer to something, I’d post it on Saturday or Sunday, knowing that I’d likely get a response on Wednesday.
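The time-of-day figure depends on putting every timestamp on the same clock first. The sketch below shows one way that could be done. It assumes timestamps arrive as ISO 8601 strings, treats UTC as equivalent to GMT, and assumes any timestamp without an offset is already in GMT; the function names and sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def hour_counts_gmt(stamps):
    """Count posts per hour of day after converting each timestamp to GMT (UTC)."""
    hours = Counter()
    for s in stamps:
        dt = datetime.fromisoformat(s)
        if dt.tzinfo is None:
            # Assumption: timestamps with no offset are already in GMT.
            dt = dt.replace(tzinfo=timezone.utc)
        hours[dt.astimezone(timezone.utc).hour] += 1
    return hours


def share_in_window(hours, start=10, end=14):
    """Fraction of posts whose GMT hour falls in [start, end)."""
    total = sum(hours.values())
    inside = sum(n for h, n in hours.items() if start <= h < end)
    return inside / total if total else 0.0


# Hypothetical timestamps (NOT real KM4Dev posts).
sample = [
    "2008-02-06T15:30:00+05:30",  # 10:00 GMT
    "2008-02-06T09:00:00-05:00",  # 14:00 GMT
    "2008-02-06T11:15:00",        # no offset, treated as 11:15 GMT
]
hours = hour_counts_gmt(sample)
print(sorted(hours.items()))
print(share_in_window(hours))
```

If the mailing list software records local server time rather than the sender's offset, the conversion step would need adjusting before the hourly counts mean anything.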
In my next post I will provide commentary on KM4Dev and Dunbar’s number.