Category Archives: Cojiro – Cross lingual curation

Project Description
Our group is working on an open-source tool, “Cojiro”, designed to enable people with complementary skill sets to identify, group and convey stories in one language to a broader audience in another language. The tool is based on the idea that in order to effectively bridge language barriers, content should only be translated if there is an audience who will actually read it. To do this, Cojiro appeals to two key user groups to narrow the focus of translation: domain experts in the source language, whose knowledge of local contexts and specific areas is essential to uncovering and grouping interesting conversations, and readers in the target language, who can evaluate which of these conversations would be of interest to foreign audiences. Closing the feedback loop between these groups would make cross-lingual sharing and collaboration a much more seamless process — and, we believe, a much more interesting and exciting one. Prototype: http://beta.cojiro.jp/ (UN/PW: guest/brain) Software codebase: https://github.com/netalab/cojiro

Update on Cojiro

Hello!

*taps microphone, dust falls to the floor….

It’s been quite a while since our last post, but things are chugging along and we’re happy to share the news. (And also happy to see that fellow projects are still alive and well! Hi Juan!)

First, a refresher on the problem that Cojiro aims to solve.

A typical flow for writing a GV article is something like this –

  1. Decide on a topic
  2. Hunt for good links – blog posts, tweets, videos, images
  3. Assess, prioritize, and order those links into a cohesive storyline
  4. Translate snippets from 3.
  5. Write the article

Normally, this is done by one person (the Author), with occasional help on steps 2 and 4 from their language/regional GV community via mailing lists and Facebook groups.

We asked ourselves: Is there a better way for non-Authors – and even non-GVers! – to contribute to steps 2 through 4? And if such a way exists, wouldn’t that form of collaborative storywriting help validate and build interest in a story before it’s even written? That would motivate both GV writers and our audience!

That was the original motivation and also where we left off the last time we shared on this blog.

However, we kept running into the same wall – our idea was too complicated for non-GV friends to understand. As any GVer who’s had trouble explaining the flow of content and division of labor between GV English and the Lingua sites knows, it’s just a very big idea to wrap one’s head around.

It was never our intention to build an internal GV tool though.

As such, much of the work in the past year has been about reframing the service concept. How does one explain cross-lingual content discovery and community building in layman’s terms? And get people excited about it?! This required nailing down the service ideas without depending on the GV context… and this was very difficult, even painful at times.

Cojiro is a platform to share and talk about awesome things on the Internet, regardless of their language, with other people who are interested in the same things as you.

We’re a few steps away from launching a sandbox site with the redefined MVP (minimum viable product) feature set, one that we believe works with a broader context. More on that soon.

Sidenote: Cojiro differs in nature from other Innovation projects in that it’s about building tools as opposed to content. This isn’t something we really knew how to do before this project started and sometimes it felt like we’d bitten off more than we could chew.

Channeling a grandiose vision into software specs – and software that we could build and maintain, at that – has proved to be quite an adventure. During this time, Chris has become an awesome engineer and I’ve gotten a ton of experience designing digital products through other channels.

If any GVer reading this post has an idea for a tool that they want to build, feel free to reach out to us, even if you aren’t going to do the development yourself. We’d love to share insight from this project – all the good stuff we learned the hard way.

Update on Cojiro platform development

Hi everyone,

Below is a quick update on our work on Cojiro over the past couple months. This being the first post since the award winners were announced, we’d like to say thank you to the community for selecting us! It’s so great to have your support, and so great that GV created these awards.

Apologies in advance: this post will be a bit technical since our main focus right now is on building the basic components of the application. I’ll try to keep the discussion as high-level as possible, but if you’re not interested in the technical stuff skip to #4 below.

For an overview of the project see our earlier post and proposal on the innovation blog, and the github page for the project.

Project proposal: Cojiro, a cross-lingual curation tool

1. Full name
Tomomi Sasaki
2. Global Voices sections to which you contribute
GV in English
Lingua
3. Publication date of your latest post or translation
Date: 9/1/2012
4. Title of project
Cojiro, a cross-lingual curation tool
5. Project representative (person who will sign award agreement and receive funds)
Chris Salzberg
6. Describe the proposed project as clearly as possible in five sentences or less
Our group is working on an open-source tool, “Cojiro”, designed to enable people with complementary skill sets to identify, group and convey stories in one language to a broader audience in another language. The tool is based on the idea that in order to effectively bridge language barriers, content should only be translated if there is an audience who will actually read it. To do this, Cojiro appeals to two key user groups to narrow the focus of translation: domain experts in the source language, whose knowledge of local contexts and specific areas is essential to uncovering and grouping interesting conversations, and readers in the target language, who can evaluate which of these conversations would be of interest to foreign audiences. Closing the feedback loop between these groups would make cross-lingual sharing and collaboration a much more seamless process — and, we believe, a much more interesting and exciting one. Prototype: http://beta.cojiro.jp/ (UN/PW: guest/brain) Software codebase: https://github.com/netalab/cojiro
7. What aspect or need of Global Voices does your project address?
Our project addresses three aspects of Global Voices:

  1. GV is a closed platform: people cannot contribute without approval from the community of editors.
  2. GV does not offer a natural way for people who lack writing (or translation) skills to contribute to content creation in other ways (e.g. through knowledge of specific topics).
  3. Translators and writers are separate groups without a natural way to collaborate (decentralized publishing reverses the flow of translation but doesn’t really change this).

1. and 2. are reminiscent of traditional journalism, which separates content consumers from content producers. Having this type of barrier prevents the kind of spontaneous, low-threshold contributions that power popular community sites such as Quora (expert answers), Amazon (product reviews) and Slashdot (news curation). There is no lack of tools and platforms which enable open contributions of the kind described in 1. and 2., and indeed GV has used many of them (blogs, wiki, etc.). But the real problem, and the real opportunity, comes with the third point: most of these platforms do not deal with languages in a way which would enable Global Voices authors, Lingua translators and their community of readers to collaborate freely in story-writing. Cojiro is an attempt to fill this void.
8. How would the project further Global Voices’ mission?
Cojiro will further GV’s mission by creating a shared space for GV authors, translators and readers to collaborate in content curation and creation. This would add transparency to the story-writing process, which currently is mostly hidden from readers. It could also bring down the barriers to cross-regional collaboration, which has been greeted with enthusiasm at summits but achieved limited success in practice. The process of actually “building a tool” would also, we feel, be beneficial for GV to be a part of. The Global Voices website states that GV will “work to develop tools, institutions and relationships that will help all voices, everywhere, to be heard”, but in practice GV has mostly used existing tools rather than built its own. This is unfortunate given the growing importance of tool-building to modern news-gathering (see data journalism, for example), and the unique set of problems that GV faces, which often go beyond the features of popular web services (e.g. translation in Storify).
9. What is innovative about your project?
Most projects tackle language barriers with one of two solutions: human translation or machine translation. A project like Meedan, for example, focuses on how to efficiently combine both to create a space for parallel conversations in Arabic/English. We look at the problem differently, not as a translation problem but as one of combining the right sets of skills in the right way. Our approach is to minimize translation by targeting only the most important and valuable elements of a story, and to situate this translation as part of a larger curation workflow. Full texts are never translated, and language-agnostic content such as images plays a strong role. This approach gives a clear sense of purpose to the translation, and allows for useful contributions without demanding a large investment of time. A tool like this will impact the way that GV contributors and audiences interact with our content, in a way that highlights the cross-lingual aspect that is unique to Global Voices.
10. Which section of Global Voices would your project most benefit (if applicable)?
GV in English
Lingua
11. How would the wider GV community utilize and/or participate in your project?
We’d love for a subset of the wider community to use the tool in their daily gathering/writing process and provide feedback on product development. This would be accomplished by setting up instances of the tool in the relevant language pairs for a few interested communities.
12. List the other GV community members, if any, who will be actively working on the project. Please specify what role each person will play in the development of the project.
* Chris Salzberg: Product lead & head developer
* Tomomi Sasaki: User experience & community outreach
* Taku Nakajima: Project development and technology strategy adviser
13. What additional resources or expertise, if any, would you need to complete the project?
Front-end coding and graphic design resources are required to speed up the development process, especially as the service becomes available to more users and we gather feedback. Cojiro receives voluntary support from two companies: AQ, the creative agency where Tomomi works, provided the visual identity and will continue to offer its expertise in building digital tools. Brain Co., Ltd., the system development company where Taku works, provides consulting on system architecture. http://www.aqworks.com http://www.brain-tokyo.jp/
14. Describe the prospects for sustainability/continuation once the innovation grant funding ends
Cojiro started in 2010 as a volunteer project, and we have managed to build a proof of concept and working prototype without any material support. The funding will greatly help in advancing the project, but the project will not necessarily halt once it ends. Currently, we see two future avenues, which are not mutually exclusive: 1) Cojiro could be financially supported by multiple parties and worked on by the original team in their available time, in much the way Global Voices operates. 2) Anyone interested in the tool could participate in its development or fork the codebase to start their own version. Cojiro is being developed as open source software on the GitHub platform under our team name Netalab, and is open to any collaborators.
15. Please specify the timeline for the project, from start to finish
The award will help Cojiro reach the next stage, but there is no clear start or finish to this project. Cojiro is currently a password-protected prototype that is available to a couple of people. With the funding, we would build an alpha version that can withstand usage from a bigger group of testers. After the project team and our testers are happy with the quality of the service, our goal would be to deploy it to support wider usage. Once more details are provided, we can draw up a timeline in a way that’s relevant to the award.
16. Provide a detailed budget of up to US$5,000 for project costs. (Please try and present as accurate a budget as possible: applicants are encouraged to submit budgets for less than the maximum amount as smaller grants allow us to fund more projects)
Estimated cost for one year ($5,000)
* Services and infrastructure costs ($1,700)
** Linode server ($500)
** Domain names ($100)
** Third party services, including Embedly ($1,000)
** License fees for fonts and icons ($100)
* Development resources ($2,300)
** Backend development ($1,300)
** Frontend coding ($1,000)
* Non-development costs ($1,000)
** Conference attendance fees ($500)
** Incidentals for meetings ($300)
** Books and tutorials ($200)
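
As a quick sanity check, the budget figures above can be summed in a few lines of Python. This is only a sketch: the amounts are copied from the proposal, while the dictionary layout and the names `budget`, `STATED_TOTAL` and `check_budget` are made up for illustration and are not part of the Cojiro codebase.

```python
# Amounts (in US$) are copied from the budget above.
# The structure and all names here are illustrative assumptions.
budget = {
    "Services and infrastructure costs": {
        "stated": 1700,
        "items": {
            "Linode server": 500,
            "Domain names": 100,
            "Third party services, including Embedly": 1000,
            "License fees for fonts and icons": 100,
        },
    },
    "Development resources": {
        "stated": 2300,
        "items": {
            "Backend development": 1300,
            "Frontend coding": 1000,
        },
    },
    "Non-development costs": {
        "stated": 1000,
        "items": {
            "Conference attendance fees": 500,
            "Incidentals for meetings": 300,
            "Books and tutorials": 200,
        },
    },
}

STATED_TOTAL = 5000  # "Estimated cost for one year"


def check_budget(budget, stated_total):
    """Return a list of mismatch messages; the list is empty if everything adds up."""
    problems = []
    for name, group in budget.items():
        # Compare each group's stated subtotal with the sum of its line items.
        actual = sum(group["items"].values())
        if actual != group["stated"]:
            problems.append(f"{name}: items sum to {actual}, stated {group['stated']}")
    # Compare the stated grand total with the sum of the stated subtotals.
    grand_total = sum(group["stated"] for group in budget.values())
    if grand_total != stated_total:
        problems.append(f"Total: groups sum to {grand_total}, stated {stated_total}")
    return problems


if __name__ == "__main__":
    problems = check_budget(budget, STATED_TOTAL)
    for line in problems:
        print(line)
    if not problems:
        print("All subtotals match the stated figures.")
```

The same check could be rerun if individual line items were adjusted, which would flag any subtotal that no longer matches.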

Platform for cross-lingual curation

Hi everyone,

Tomomi and I and our friend Taku Nakajima have been working on a project for the last year and a half called Cojiro, which we submitted to the innovation awards last week. The project is to build a web platform for what we call “cross-lingual curation”: curating content from one language to another. We’ve actually already built a prototype (see below) which we’re currently testing. If we get the grant, it will fund the next stage of the project, which will be entirely open (for more info see the GitHub page we just set up).