D-Lib Magazine
The Magazine of Digital Library Research

November/December 2012
Volume 18, Number 11/12

 

Exploring Social Curation

Michael Zarro and Catherine Hall
Drexel University
{mzarro,ceh48}@drexel.edu

doi:10.1045/november2012-zarro

 


 

Abstract

This work investigates social curating activities on the website Pinterest, and relates them to digital libraries. Pinterest is a social curation site that combines features such as sharing, liking, following and commenting with the information management characteristics of successful data curation. Effectively combining social media techniques and data curation practices will result in new ways of interacting with Web users, providing insight into the development of useful social media efforts by libraries, archives, and museums, as well as commercial organizations.

 

Introduction

Curating collections of digital objects found on the Web is a popular way of storing resources for future use. While practices like bookmarking, tagging, and downloading support collecting, the widespread adoption of high-speed Internet and Web 2.0 technologies enables collecting to become a social activity. The website Pinterest, currently estimated as the third most popular social media website in the United States (Experian Marketing Services, 2012), allows users to easily "organize and share" objects they encounter on the Web by curating digital collections on a virtual "pinboard." Pinterest is a social curation site, combining the social features of sharing, liking, following, and commenting with the curating capabilities of bookmarking, tagging, and personal information management (Jones, 2007). Users of social curating sites create "ad hoc" categories (Barsalou, 1983) conforming to their personal notions. The combination of these social and curating qualities points towards new ways of interacting with Web users, providing insight for the social media efforts of commercial organizations as well as libraries, archives, and museums. This work investigates social curating activities on the website Pinterest.com and relates them to digital libraries.

 

Social Media and LAM

Social curating offers a way for digital libraries to get a "social life" (Marshall & Bly, 2004). While previous efforts to use social media, like those on The Commons on Flickr, have some social elements, they remain organized and controlled by Library, Archive and Museum (LAM) professionals. In contrast, Pinterest users curate collections and add annotations with no supervision or institutional control. Users follow one another and unique collections in a Twitter-like following model, forming networks based on shared interests. Digital libraries can learn from these unsupervised collecting activities, and this model may provide a way to extend the reach of the library to patrons who may never otherwise visit that library's physical or digital collections.

Past LAM social media projects have seen success. Steve: The Museum Social Tagging Project (Trant, 2009) incorporates user-contributed tags into descriptions of collections at several museum websites and can be seen in use at the Indianapolis Museum of Art. The Library of Congress received an "overwhelming" response to their images on Flickr Commons (Springer, et al., 2008), collecting user annotations and increasing public access. The Smithsonian Institution determined to "go where visitors are" by adding their images to Flickr Commons rather than "requiring them to come to us" (Kalfatovic, et al., 2009).

Pinterest's growing popularity makes it a prime candidate for the attention of LAM professionals. We suggest that Pinterest and similar social curation sites can be used to expand the reach of LAM collections and gather user annotations. Social curators create ad hoc categories that are "perspective or context dependent, [and] therefore show a wide range of concepts" (Barsalou, 1983). These categories are created on the fly and represent a multitude of user thoughts, opinions, and judgments that can supplement traditional cataloging.

 

Pinterest

Digital objects are collected on Pinterest in two ways. First, organizations create an account and add their own content. Pinterest is currently used by a number of LAMs, non-profits, and commercial organizations to deliver information, promote brand awareness, and engage with their customers. LAM organizations currently with Pinterest accounts include the New York Public Library, the Smithsonian, and the Philadelphia Museum of Art. Second (and more frequently), as discussed below, Pinterest users add an organization's content to their personal ad hoc collections as the context of social curating allows reusing others' materials (Marshall & Shipman, 2011).

The design of Pinterest as a "lightweight shared place" (Marshall & Bly, 2004) is a likely explanation for its success in comparison to independent efforts on LAM websites (Marty, 2011). Pinterest supports collecting by letting users add pins (images, text descriptions, and sources) copied from external websites to pinboards (collections). The Pinterest website has a simple grid-based layout (see Figure 1) that supports searching, browsing, and serendipitous resource discovery. Web 2.0 tools, including a browser bookmarklet and "Pin This" buttons, enable seamless collecting from almost any website. The following terms describe Pinterest activities, with explanations from the Pinterest Help page:

  • Repin: "A repin is adding an image you find while browsing Pinterest to your own board."
  • Repinning: "Repinning an image allows you to categorize the image onto one of your boards. You can edit the description of a repin."
  • Pinboard: "A board is a set of pins.... Commenting and boards with multiple users are also allowed."
  • Like: "Liking a pin adds the image to your profile's Likes section."
  • Follow: Users can follow another user's activity, meaning "you'll have all of a user's pins on all their boards shown to you in real-time on Pinterest" or an individual board, "if you're only interested in seeing a user's pins to specific boards."
  • Comment: Users can annotate any pin (their own or another user's) with text comments.
 
 
Figure 1: A pinboard from the website Pinterest.
 
 
Figure 2: A pin included in the Pinboard in Figure 1.
 
 

Research Questions

Exploring the Pinterest website and the literature discussed above led to the following exploratory research questions.

  1. What is the nature of popular pins on Pinterest?
  2. What is the nature of ad hoc categories created on Pinterest?
  3. What is the provenance of pins? Are pins sourced from the original document/image on the web, or are they copies of copies?
 

Social Curation on Pinterest

 

Data Collection

Beginning on February 15, 2012 and ending on March 15, 2012 we used the Pinterest API (since discontinued) to collect the top 25 Pins in the Pinterest system every five to 10 minutes, resulting in a total of 291,125 pins. Represented in this popular dataset are 78,261 unique users, 24,952 unique source domains, and 79,768 unique source URLs. Descriptions range from one to 7,420 characters, with an average of 29 characters. Repins ranged from zero to 80,914 (mean 1,710); Comments from zero to 469 (mean of nine); and Likes from zero to 23,167 (mean of 244).

From the popular dataset, we removed duplicate postings, removed any pin described by fewer than three characters, then randomly selected 1,000 pins for analysis. The resulting data consists of 946 unique users, 632 unique source domains, and 904 unique source URLs. Eighty-eight pins in this dataset were uploaded by the user from his/her computer or mobile device and contain no source domain or source URL. Descriptions in this dataset range from three to 615 characters (mean of 28 characters). Repins ranged from two to 22,897 (mean of 470); Comments from one to 112 (mean of four); and Likes from zero to 3,352 (mean of 75). Some domains are related; for example, the domain name blogspot.com has many subdomains (subdomain.blogspot.com) that contribute to the datasets above. The data collected for each pin in this study (shown in Table 1) consists of user-contributed text in the form of descriptions and board names, and social metrics in the form of repin, like, and comment counts.

Table 1. Data collected from the Pinterest API for this study.

Field            Description
Description      The user-created text annotation for the pin.
Source Domain    The web domain where the pinned image was found.
Source URL       The unique webpage where the pinned image was found.
Repins Count     Number of times another user repinned the pin.
Likes Count      Number of times another user liked the pin.
Comments Count   Number of times another user commented on the pin.
Board Name       The pinboard to which the pin was added.
 
 

Popular Domain Types

To determine the most popular sources for pins in our sample, we extracted the most frequent domains from the sample dataset, keeping only those that contributed three or more pins. Using an iterative coding process, we grouped these popular domain names into the categories in Table 2.

Table 2. Domain categories based on the most frequent domains (% of all pins).

Category (% of all pins): Description
Blog hosts (20.8%): Journal-style sites operated by individuals or small groups. Example domains include blogspot.com and tumblr.com.
Search Engines (7.9%): Image search results from the web search engines Google and Yahoo.
Online Magazines and Magazine-like Blogs (5.1%): Sites that have many magazine-like features: diverse content types (e.g., articles, opinion pieces, and reviews), frequent updates (usually multiple times daily), and many contributors. Examples include Houzz and Better Homes and Gardens.
Image and Video Sharing/Hosting (4.2%): Sites whose primary function is as a place where users can upload and store images and videos. Examples include Flickr and weheartit.
Ecommerce (3.6%): Sites primarily designed for the buying and selling of goods. Examples include Etsy and Amazon.
Social Curation (3.2%): Sites similar to Pinterest in that their main function is to collect and curate material for public or community consumption. Polyvore and Piccsy comprise this category.
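
The frequency step that precedes the manual coding, counting pins per source domain and keeping domains with three or more pins, might look like the sketch below. The function name and the two-label rule for folding subdomains such as subdomain.blogspot.com into blogspot.com are assumptions made for illustration; the article does not describe how related domains were grouped.

```python
from collections import Counter
from urllib.parse import urlparse


def frequent_domains(source_urls, min_pins=3):
    """Count pins per source domain and keep domains with at least `min_pins`."""
    counts = Counter()
    for url in source_urls:
        if not url:
            continue  # pins uploaded by the user carry no source URL
        host = urlparse(url).hostname
        if host is None:
            continue
        labels = host.lower().split(".")
        # Naive grouping (assumption): keep the last two labels, so that
        # "someone.blogspot.com" is counted under "blogspot.com".
        domain = ".".join(labels[-2:])
        counts[domain] += 1
    return [(d, n) for d, n in counts.most_common() if n >= min_pins]
```

The two-label rule is deliberately crude: it would mis-group hosts under suffixes such as co.uk, so a real analysis would need a public-suffix list or manual review before coding the domains into categories.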
 
 

Boards and Ad Hoc Categories

Guided by the Pinterest community's top-level groupings, we manually categorized each board in our sample dataset into a category (see Table 3). We found many ad hoc collections in our selection. A substantial number of pinboards are personally relevant categories or relate to home, do-it-yourself (DIY), entertaining, and fashion. Examples include Places I'd Like to Go, For the Home, and Birthday Party Ideas that provide a sense of the collection. Some Pinterest board names, however, follow the "folksonomic flaw" in that board names are "often ambiguous, overly personalised and inexact" (Guy & Tonkin, 2006). For example, the most frequent categorization of pinboards is "Other" exemplified by boards named precious, stuff, or Rob. The second most popular categorization, "My Life" includes boards Love This, Good Stuff, and A Girl Can Dream Can't She.

Next, we searched for Library of Congress Subject Headings (LCSH) that exactly match a board name using tools available at http://id.loc.gov. We found that 150 pinboard names (15%) matched an LCSH using this method. All of the matches we found are for board names that are a single word, such as Architecture, Cats, and Shoes. Pinboard names that do not return a match include more complex or personal terms like Favorite Places and Spaces, Dahling ... you look FABULOUS, and eCards I found on the floor. Descriptions of individual pins fare even worse, with just 91 of 1,000 (9%) matching LCSH. For both board names and descriptions, all matches were for terms that consist of a single word, while the board names average 2.4 words. Previous studies of social media sites showed a greater overlap between user contributions and LCSH (Stvilia & Jörgensen, 2010; Heymann & Garcia-Molina, 2009). However, these works investigated tags, which are generally one-word terms, in controlled settings. The low overlap in our data suggests ad hoc categories express concepts not represented in LCSH. Ad hoc categorization might lead to new indexing schemes for LAM by revealing terms for user-centered indexing (Fidel, 1994) and provide a richer description of resources.
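
The exact-match comparison can be illustrated offline with a small sketch. It assumes the subject headings have already been obtained as a plain list of strings (the article used the tools at id.loc.gov, whose lookup behavior is not reproduced here), and it assumes that matching ignores case and extra whitespace, which the article does not specify. All function names are hypothetical.

```python
def normalize(label):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(label.lower().split())


def lcsh_matches(names, headings):
    """Return the names that exactly match a heading after normalization."""
    heading_set = {normalize(h) for h in headings}
    return [name for name in names if normalize(name) in heading_set]


def match_rate(names, headings):
    """Share of names that match a heading (0.0 for an empty list)."""
    return len(lcsh_matches(names, headings)) / len(names) if names else 0.0


# Board names quoted in the article; the heading list is a tiny stand-in.
boards = [
    "Architecture",
    "Cats",
    "Shoes",
    "Favorite Places and Spaces",
    "eCards I found on the floor",
]
headings = ["Architecture", "Cats", "Shoes"]
matched = lcsh_matches(boards, headings)
```

Because the comparison is whole-string, multi-word personal names such as Favorite Places and Spaces can only match if the identical phrase exists as a heading, which is consistent with the single-word pattern reported above.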

The topics of interest to Pinterest users based on our analysis are often from non-scholarly sources (blogs, online magazines) and relate to personal interest topics (home décor, DIY & crafts). One may deem these topics, and thus Pinterest itself, not interesting to digital library professionals because they are not academic materials. We do not subscribe to this approach for two reasons. First, the technology used on Pinterest has attracted a massive user base in a short period of time. Certainly there are lessons that LAM professionals can use to improve the state of social media services. Second, these topics are part of everyday life information seeking (ELIS) (Savolainen, 1995; McKenzie, 2003), and relevant to Personal Information Management (Jones, 2007). Shared interests serve as discovery tools for the general public on Pinterest, similar to ArtStor's efforts to do this very thing for art scholars at member institutions. Similarly, Wikimedia Commons and Flickr support the creation of crowd-sourced materials that are available to the general public, but lack the simple collecting capabilities of a social curation site.

Table 3. Pinboard Categorization

Pinterest Category        Boards  LCSH Match  Example Board Name
Other                        165          15  Random
My Life                      150          13  Things I Love
Food & Drink                 131          33  Recipes to try
Home Décor                    81          10  Interior Design
DIY & Crafts                  62          11  Feelin' Crafty
Women's & Men's Apparel       52           7  Fashion Picks
Hair & Beauty                 45          14  Popular Hairstyles
Travel & Places               44           3  I Want To Go To There
Weddings & Events             40           4  Wedding and Event Ideas
Humor                         37           6  Just Plain Funny
Kids                          27           2  For the wee ones
Pets                          25           6  Cats and Kittens
Design                        23           1  Color Schemes
Gardening                     22           3  Garden Dreams
Film, Music & Books           18           1  Books
Fitness                       16           2  Eat Clean, Train Mean, Get Lean
Photography                   14           6  Photography
Art                           11           2  Ideas for art class
Holidays                      10           4  Easter
Products                       9           3  Full Time Etsy Crafters
People                         6           1  Good People
Architecture                   4           1  Octagonal Barns
Tech                           3           1  Gadgets
Science & Nature               3           1  Nature
Geek                           1           0  Geekery
Cars & Motorcycles             1           0  I like cars
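
As a worked check on Table 3, the per-category LCSH match rate can be derived from the two count columns. The sketch below copies the counts from the table; the variable and function names are illustrative only, and the rates are a derived reading of the published counts rather than figures reported by the authors.

```python
# (category, boards, LCSH matches), copied from Table 3.
TABLE_3 = [
    ("Other", 165, 15), ("My Life", 150, 13), ("Food & Drink", 131, 33),
    ("Home Décor", 81, 10), ("DIY & Crafts", 62, 11),
    ("Women's & Men's Apparel", 52, 7), ("Hair & Beauty", 45, 14),
    ("Travel & Places", 44, 3), ("Weddings & Events", 40, 4),
    ("Humor", 37, 6), ("Kids", 27, 2), ("Pets", 25, 6), ("Design", 23, 1),
    ("Gardening", 22, 3), ("Film, Music & Books", 18, 1), ("Fitness", 16, 2),
    ("Photography", 14, 6), ("Art", 11, 2), ("Holidays", 10, 4),
    ("Products", 9, 3), ("People", 6, 1), ("Architecture", 4, 1),
    ("Tech", 3, 1), ("Science & Nature", 3, 1), ("Geek", 1, 0),
    ("Cars & Motorcycles", 1, 0),
]


def match_rates(rows):
    """Return (category, rate) pairs sorted by descending LCSH match rate."""
    rates = [(name, matches / boards) for name, boards, matches in rows]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


total_boards = sum(boards for _, boards, _ in TABLE_3)
total_matches = sum(matches for _, _, matches in TABLE_3)
overall_rate = total_matches / total_boards  # share of boards matching LCSH
```

The column totals reproduce the 1,000 sampled boards and the 150 LCSH matches (15%) reported in the text. Rates for the smallest categories rest on very few boards and should be read with caution.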
 
 

Adherence to Pinterest Guidelines

Lesk defined digital libraries as "organized collections of digital information" (1997, p. ixx). Pinterest is forming a bottom-up digital library created by users and enhanced by shared interests and social connections. However, as shown above, it can be difficult to create a digital library with no top-down structure. To address this issue, the site operators provide the following instructions on the Pinterest Help page for the creation of useful pins:

To make Pinterest the most useful to yourself and others, follow best practices when pinning: 1. Pin from the original source. 2. Pin from permalinks. 3. Give credit and include a thoughtful pin description.

Our analysis takes its cue in part from these instructions, as both we and the site operators are interested in useful pins. However, the popular dataset shows many users add descriptions that are not thoughtful or descriptive, for example, a description that includes only a single letter. These "non-descriptive" descriptions did not prevent a single-letter pin from appearing among the top 25 postings on a highly popular social media site, lending credence to the notion that the images alone, as a cue for finding or refinding (Teevan, et al., 2009), may be largely responsible for the usefulness of a pin.

 

Library of Congress and Smithsonian Pins

The prevalence of image sharing and blogging sites in our study suggests many pins are not collected from the original source, but rather are copies of originals. To investigate this question specifically in relation to digital libraries, we studied additional pins of resources from the Library of Congress and Smithsonian. Using the Pinterest search tool we conducted a search for the terms "library of congress" and "smithsonian". Selected from the search results were 25 pins representing an institution's holdings for each search. Our examination of the 50 pins found many images not posted from the original source. Twenty pins were pinned from the official Library or Smithsonian site (including an image from the official Smithsonian store). A total of 13 Flickr images were pinned from the Library of Congress or Smithsonian photostreams and a photo group on Flickr. The remainder (17) came from blogs, online magazines, e-commerce sites, or the Pinterest upload tool.

Fewer than half of the Library of Congress and Smithsonian pins were sourced directly from the original website. This suggests that images are often copied and re-copied across the Web. Figure 3 shows the path of one image we investigated from the Smithsonian that propagated across several sites and within Pinterest itself. A Pinterest user desiring to find the original would have to traverse many levels before reaching her goal.

Level 1: Smithsonian adds the image to their online digital collection.

Level 2: Flickr Commons: The Smithsonian adds the image to their Flickr photostream.

Level 3: The design blog apartmenttherapy.com posts a copy of the digital photo to their article "historic interiors."

Level 4: A Pinterest user adds the copy from the blog apartmenttherapy.com to their personal pinboard. Represented here are also copies of the pin from other sources and repins within Pinterest.

Figure 3. Graph of Pinterest sources for a unique image that may be copied several times across the social web.
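
The effort of tracing a pin back through these levels can be sketched as a walk over "copied from" links. This is an illustration only: the mapping below is a hand-built stand-in for the four levels listed above, it assumes each level was copied from the one before it, and the function name and node labels are hypothetical.

```python
def trace_to_origin(start, copied_from):
    """Follow copied-from links from `start` back to the earliest known source.

    `copied_from` maps each copy to the location it was copied from.
    Returns the path from `start` to the origin, stopping early on a cycle.
    """
    path = [start]
    seen = {start}
    current = start
    while current in copied_from:
        current = copied_from[current]
        if current in seen:
            break  # guard against circular repin records
        path.append(current)
        seen.add(current)
    return path


# Stand-in for the levels described for Figure 3 (assumed linkage).
copied_from = {
    "Pinterest repin": "Pinterest pin",
    "Pinterest pin": "apartmenttherapy.com article",
    "apartmenttherapy.com article": "Smithsonian Flickr photostream",
    "Smithsonian Flickr photostream": "Smithsonian digital collection",
}
path = trace_to_origin("Pinterest repin", copied_from)
hops = len(path) - 1  # number of steps a user must retrace
```

In practice these links are rarely recorded: a pin stores only its immediate source URL, so each further step back depends on the intermediate site crediting its own source.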

 

In the collection of 50 Library of Congress and Smithsonian pins we downloaded, we counted a total of one comment, 19 likes, and 150 repins. While this does not match the level of interaction found in the most popular pins, it shows at least 170 social interactions that Pinterest users have had with these LAM materials on this site. In the popular dataset we found three unique pins with the Smithsonian's domain, si.edu, and no pins with the Library of Congress' domain, loc.gov. One pin was from the Library's Flickr.com photostream. No pins were from the Smithsonian's Flickr photostream; however, three pins were added from the Smithsonian magazine website.

 

Limitations

This work looks at subsets of pins and pinboards on Pinterest. Given our methods, we make no claim that our data is representative of the Pinterest community as a whole. Nevertheless, we feel this research shows interesting social curating behaviors, and points to future research and development for the benefit of digital libraries and user/patron collecting tools.

 

Conclusion

Web users curate social collections in ways made possible only by recent innovations in Web 2.0 technologies. Previous research shows tools created by LAM are not popular among website visitors (Filippini-Fantoni & Bowen, 2007). Tools provided by Pinterest.com make it easy for a web user to add content without interrupting their information seeking process, fulfilling Marshall and Bly's (2004) suggestion to digital librarians that "there is opportunity for innovation and refinement in ... the facilities we give readers for interacting with interesting material." Implementing techniques used by Pinterest may help spur user adoption of social and curating tools created for LAM websites.

This work is part of a larger research project investigating collecting and sharing on social sites. Already some LAM have an organizational presence on Pinterest, following the Smithsonian "go where they are" approach (Kalfatovic, et al., 2009). The variety of sources is an opportunity for digital libraries to expand their collections with online-only resources identified by the Pinterest community, while the low overlap of board names with LCSH and the curation practices of Pinterest users provide curatorial insight for LAM, ELIS, and personal information management. Ad hoc categories we observed could inform user-centered indexing practices in digital libraries. The level of social interaction shown by repins, likes, and comments implies substantial interest in resources on Pinterest. We recommend further study of Pinterest by LAM professionals who are creating Web 2.0 tools and social media strategies.

 

Acknowledgements

The authors extend a special thanks to Xia Lin, Andrea Forte, and Joan Beaudoin for their timely feedback and advice. Doctoral studies of both authors have been supported by IMLS research fellowships.

 

References

[1] Barsalou, L. W. (1983). Ad hoc categories. Memory & Cognition, 11(3), 211–227.

[2] Experian Marketing Services. (2012). The 2012 Digital Marketer Trend and Benchmark Report.

[3] Fidel, R. (1994). User-centered indexing. Journal of the American Society for Information Science, 45(8), 572–576.

[4] Filippini-Fantoni, S., & Bowen, J. (2007). Bookmarking in museums: extending the museum experience beyond the visit? Museums and the Web (Vol. 7).

[5] Guy, M., & Tonkin, E. (2006). Folksonomies: Tidying up tags? D-Lib Magazine, 12(1). http://dx.doi.org/10.1045/january2006-guy

[6] Heymann, P., & Garcia-Molina, H. (2009). Contrasting Controlled Vocabulary and Tagging. Proceedings of the Second ACM International Conference on Web Search and Data Mining, WSDM 2009. Presented at the WSDM.

[7] Jones, W. (2007). Personal Information Management. Annual Review of Information Science and Technology, 41(1), 453–504. http://dx.doi.org/10.1002/aris.2007.1440410117

[8] Kalfatovic, M. R., Kapsalis, E., Spiess, K. P., Van Camp, A., & Edson, M. (2009). Smithsonian team Flickr: A library, archives, and museums collaboration in web 2.0 space. Archival Science, 8(4), 267–277.

[9] Lesk, M. (1997). Practical digital libraries: Books, bytes, and bucks. Morgan Kaufmann.

[10] Marshall, C. C., & Bly, S. (2004). Sharing encountered information: digital libraries get a social life. Proceedings of the 2004 Joint ACM/IEEE Conference on Digital Libraries (pp. 218–227). IEEE.

[11] Marshall, C. C., & Shipman, F. M. (2011). The ownership and reuse of visual media. Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries (pp. 157–166). ACM.

[12] Marty, P. F. (2011). My lost museum: User expectations and motivations for creating personal digital collections on museum websites. Library & Information Science Research, 33(3), 211–219. http://dx.doi.org/10.1016/j.lisr.2010.11.003

[13] McKenzie, P. J. (2003). A model of information practices in accounts of everyday-life information seeking. Journal of Documentation, 59(1), 19–40.

[14] Savolainen, R. (1995). Everyday life information seeking: Approaching information seeking in the context of "way of life." Library & Information Science Research, 17(3), 259–294.

[15] Springer, M., Dulabahn, B., Michel, P., Natanson, B., Reser, D., Woodward, D., & Zinkham, H. (2008). For the Common Good: The Library of Congress Flickr Pilot Project. Library of Congress.

[16] Stvilia, B., & Jörgensen, C. (2010). Member activities and quality of tags in a collection of historical photographs in Flickr. Journal of the American Society for Information Science and Technology, 61, 2477–2489. http://dx.doi.org/10.1002/asi.21432

[17] Teevan, J., Cutrell, E., Fisher, D., Drucker, S. M., Ramos, G., André, P., & Hu, C. (2009). Visual snippets: summarizing web pages for search and revisitation. Proceedings of the 27th International Conference on Human Factors in Computing Systems, CHI '09 (pp. 2023–2032). New York, NY, USA: ACM. http://dx.doi.org/10.1145/1518701.1519008

[18] Trant, J. (2009). Studying social tagging and folksonomy: A review and framework. Journal of Digital Information, 10(1).

 

About the Authors


Michael Zarro is a PhD Candidate in Information Science at the iSchool at Drexel University. His research interests include exploratory search, human-computer interaction, and health informatics. Prior to entering the PhD program, Michael earned an MS in Library and Information Science at Drexel and worked as an Information Architect in e-commerce and online learning.

 

Catherine Hall is a PhD Candidate in Information Science at the iSchool at Drexel University. Her research interests include digital libraries, metadata, and Next Generation catalogs. Prior to joining the PhD program, Catherine obtained an M.A. in Information and Library Management from Loughborough University (UK) and worked in academic and special libraries.

 