Skip to main content


Showing posts from November, 2018

Google+ Migration - Part VI: Location, Location, Location!

<- Image Attachments Conversion & Staging -> Before we focus on putting all the pieces together, here a small, optional excursion into how to make use of location information contained in G+ posts. We should consider carefully if and how we want to include geo location information as there might be privacy and safety implications. For such locations, it can make sense to choose the point of a nearby landmark or add some random noise to the location coordinates. Many of my public photo sharing post containing the location of near where the photos where taken. Diaspora* posts can contain a location tag as well, but it does not seem to be very informative and the diaspy API currently does not support adding post a post location. Instead we can process the location information contained in the post takeout JSON files and transform it to extract some information which we can use to format the new posts. In particular, we want to include a location link to the corre

Google+ Migration - Part V: Image Attachments

< - Part IV: Visibility Scope & Filtering Part VI: Location, Location, Location -> Google+ has always been rather good at dealing with photos - the photo functions were built on the foundation of Picasa and later spun out as Google Photos. Not surprising that the platform was popular with photographers and many posts contain photos. In the takeout archive, photos or images/media file attachments to posts are rather challenging. In addition to the .json files containing each of the posts, the  Takeout/Google+ Stream/Posts directory also includes two files for each image attached to a post. The basename is the originally uploaded filename, with a .jpg extension for the image file itself and a jpg.metadata.csv for for some additional information about the image. If we originally attached an image cat.jpg to a post, there should now be a cat.jpg and cat.jpg.metadata.csv file in the post directory. However if over the years, we have been unimaginative in naming files and

Google+ Migration - Part IV: Visibility Scope & Filtering

<- Part III: Content Transformation Part V: Image Attachments -> Circles and with them the ability to share different content with different sets of people was one of the big differentiators of Google+ over other platforms at the time, which typically had a fixed sharing model and visibility scope. Circles were based on the observation that most people in real life interact with several "social circles" and often would not want these circles to mix. The idea of Google+ was that it should be possible to manage all these different circles under a single online identity (which should also match the "real name" identity of our governments civil registry). It turns out that while the observation of disjoint social circles was correct, most users prefer to use different platform and online identities to manage to make sure they don't inadvertently mix. Google+ tried hard to make sharing scopes obvious and unsurprising, but the model remained complex,

Google+ Migration - Part III: Content Transformation

<- Part II: Understanding the takeout archive  Part IV:  -> Visibility Scope & Filtering -> After we have had a look at the structure of the takeout archive, we can build some scripts to translate the content of the JSON post description into a format that is suitable for import into the target system, which in our case is Diaspora*. The following script is a proof of concept conversion of a single post file from the takeout archive to text string that is suitable for upload to a Diaspora* server using the diaspy API. Images are more challenging and will be handled separately in a later episode. There is also no verification on whether the original post had public visibility and should be re-posted publicly. The main focus is on the parse_post and format_post methods. The purpose of the parse_post method is to extract the desired information from the JSON representation of a post, while the format_post method uses this data to format the input text needed to