Monday, January 20, 2014

A Cloud over the Internet

Cloud computing could not have existed without the Internet, but it may make Internet history by making the Internet history.

Organizations are rushing to move their data centers to the cloud. Individuals have been using cloud-based services like social networks, cloud gaming, Google Apps, Netflix, and Aereo. Recently, Amazon introduced WorkSpaces, a comprehensive personal cloud-computing service. The immediate benefits and opportunities that fuel the growth of the cloud are well known. The long-term consequences of cloud computing are less obvious, but a little extrapolation may help us make some educated guesses.

Personal cloud computing takes us back to the days of remote logins with dumb terminals and modems. Like the office computer of that era, the cloud computer does almost all of the work. Like the dumb terminal, a not-so-dumb access device (anything from the latest wearable gadget to a desktop) handles input and output. Input has evolved beyond keystrokes to include touch-screen gestures, voice, images, and video. Output has evolved from green-on-black characters to multimedia.

When accessing a web page with content from several contributors (advertisers, for example), the load time depends on several factors: the performance of the computers that contribute web-page components, the speed of the Internet connections that transmit those components, and the performance of the computer that assembles and formats the page for display. By connecting to the Internet through a cloud computer, we bypass the performance limitations of our access device. All bandwidth-hungry communication occurs in the cloud on ultra-fast networks, and almost all computation occurs on a high-performance cloud computer. The access device and its Internet connection need only be fast enough to process the information streams into and out of the cloud. Beyond that, the performance of the access device hardly matters.
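A toy back-of-envelope model makes the claim concrete; every number in the sketch below is hypothetical:

```python
# Toy model of the argument above; all timings are hypothetical.
# On a conventional setup, the access device fetches every component
# over a consumer connection and renders the page itself. With a cloud
# computer, fetching and assembly happen on data-center networks, and
# the device only handles a single stream.

component_fetch_ms = [120, 200, 80, 150]   # component fetch times at home
device_render_ms = 300                     # rendering on a slow device

on_device = sum(component_fetch_ms) + device_render_ms

datacenter_speedup = 0.1                   # ultra-fast cloud networking
stream_ms = 50                             # one stream to the access device
in_cloud = sum(t * datacenter_speedup for t in component_fetch_ms) + stream_ms

print(f"assembled on the access device: {on_device} ms")    # 850 ms
print(f"assembled on a cloud computer:  {in_cloud:.0f} ms")  # 105 ms
```

Under these made-up numbers, a tenfold improvement in the access device's rendering speed would barely matter; the cloud-assembled page is already dominated by the single stream to the device.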

Because of economies of scale, the cloud-enabled net is likely to be a highly centralized system dominated by a small number of extremely large providers of computing and networking. This extreme concentration of infrastructure stands in stark contrast to the original Internet concept, which was designed as a redundant, scalable, and distributed system without a central authority or a single point of failure.

When a cloud provider fails, it disrupts its own customers, and the disruption immediately propagates to those customers' clients. Every large provider is, therefore, a systemic vulnerability with the potential to take down a large fraction of the world's networked services. Of course, cloud providers are building infrastructure of extremely high reliability, with redundant facilities spread around the globe to protect against regional disasters. Unfortunately, the facilities of a single provider all have identical vulnerabilities, as they use identical technology and share identical management practices. This is a setup for black-swan events: low-probability, large-scale catastrophes.

The Internet is overseen and maintained by a complex international set of authorities. [Wikipedia: Internet Governance] That oversight loses much of its influence when most communication occurs within the cloud. Cloud providers will be tempted to deploy more efficient custom communication technology within their own facilities. After all, standard Internet protocols were designed for heterogeneous networks. Much of that design is not necessary on a network where one entity manages all computing and all communication. Similarly, any two providers may negotiate proprietary communication channels between their facilities. Step by step, the original Internet will be relegated to the edges of the cloud, where access devices connect with cloud computers.

Net neutrality is already on life support. When cloud providers compete on price and performance, they are likely to segment the market. Premium cloud providers are likely to attract high-end services and their customers, relegating the rest to second-tier low-cost providers. Beyond net neutrality, there may be a host of other legal implications when communication moves from public channels to private networks.

When traffic moves to the cloud, telecommunication companies will gradually lose the high-margin retail market of providing organizations and individuals with high-bandwidth point-to-point communication. They will not derive any revenue from traffic between computers within the same cloud facility. The revenue from traffic between cloud facilities will be determined by a wholesale market with customers that have the resources to build and/or acquire their own communication capacity.

The existing telecommunication infrastructure will mostly serve to connect access devices to the cloud over relatively low-bandwidth channels. When TV channels are delivered to the cloud (regardless of technology), users select their channel on the cloud computer. They do not need all channels delivered to the home at all times; one TV channel at a time per device will do. When phones are cloud-enabled, a cloud computer intermediates all communication and provides the functional core of the phone.

Telecommunication companies may still come out ahead as long as the number of access devices keeps growing. Yet they should at least question whether it would be more profitable to invest in cloud computing rather than in ever-higher bandwidth to the consumer.

The cloud will continue to grow as long as its unlimited processing power, storage capacity, and communication bandwidth provide new opportunities at irresistible price points. If history is any guide, long-term and low-probability problems at the macro level are unlikely to limit its growth. Even if our extrapolated scenario never completely materializes, the cloud will do much more than increase efficiency and/or lower cost. It will change the fundamental character of the Internet.

Wednesday, January 1, 2014

Market Capitalism and Open Access

Is it feasible to create a self-regulating market for Open Access (OA) journals where competition for money is aligned with the quest for scholarly excellence?

Many proponents of the subscription model argue that a competitive market provides the best assurance of quality. This argument ignores the fact that the relationship between a strong subscription base and scholarly excellence is tenuous at best. What if we created a market that rewards journals when a university makes its most tangible commitment to scholarly excellence: a faculty appointment?

While the role of journals in actual scholarly communication has diminished, their role in academic career advancement remains as strong as ever. [Paul Krugman: The Facebooking of Economics] The scholarly-journal infrastructure streamlines the screening, comparing, and short-listing of candidates. It enables the gathering of quantitative evidence in support of a hiring decision. Without journals, the workload of search committees would skyrocket. If scholarly journals are the headhunters of the academic-job market, let us compensate them as such.

There are many ways to structure such compensation, but we only need one example to clarify the concept. Consider the following scenario:

  • The new hire submitted a bibliography of 100 papers.
  • The search committee selected 10 of those papers to argue the case in favor of the appointment. This subset consists of 6 papers in subscription journals, 3 papers in the OA journal Theoretical Approaches to Theory (TAT), and 1 paper in the OA journal Practical Applications of Practice (PAP).
  • The university's journal budget is 1% of its budget for faculty salaries. (In reality, that percentage would be much lower.)

Divide the new faculty member's share of the journal budget, 1% of his or her salary, into three portions:

  • (6/10) × 1% = 0.6% of salary to subscription journals,
  • (3/10) × 1% = 0.3% of salary to the journal TAT, and
  • (1/10) × 1% = 0.1% of salary to the journal PAP.

The first portion (0.6%) remains in the journal budget to pay for subscriptions. The second (0.3%) and third (0.1%) portions are awarded yearly to the OA journals TAT and PAP, respectively. The university adjusts the reward formula every time a promotion committee determines a new list of best papers.
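As a minimal sketch, the reward split generalizes to any selection of papers. The function below reproduces the scenario's 6/3/1 arithmetic; the journal names and the 1% base come from the scenario, and the code itself is only illustrative:

```python
# Illustrative sketch of the headhunting-reward split described above.

def reward_split(selected_papers, base_fraction=0.01):
    """Divide a new hire's share of the journal budget (base_fraction of
    salary) among journals, proportionally to the number of selected
    papers each journal published."""
    total = sum(selected_papers.values())
    return {journal: base_fraction * count / total
            for journal, count in selected_papers.items()}

# The scenario above: 10 selected papers, split 6/3/1.
for journal, share in reward_split(
        {"subscription journals": 6, "TAT": 3, "PAP": 1}).items():
    print(f"{journal}: {share:.3%} of salary")
# subscription journals: 0.600% of salary
# TAT: 0.300% of salary
# PAP: 0.100% of salary
```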

To move beyond a voluntary system, universities should give headhunting rewards only to those journals with which they have a contractual relationship. Some Gold OA journals are already pursuing institutional-membership deals that eliminate or reduce author page charges (APCs). [BioMed Central] [PeerJ] [SpringerOpen] Such memberships are a form of quantity discounting. Instead, we propose a pay-for-performance contract that eliminates APCs in exchange for headhunting rewards. Before signing such a contract, a university would conduct a due-diligence investigation of the journal. It would assess the publisher's reputation; the journal's editorial board; its refereeing, editing, formatting, and archiving standards; its OA licensing practices; and its level of participation in abstracting-and-indexing and content-mining services. This step would all but eliminate predatory journals.

Every headhunting reward would enhance the prestige (and the bottom line) of a journal. A reward citing a paper would be a significant recognition of that paper. Such citations might be even more valuable than citations in other papers, thereby creating a strong incentive for institutions to participate in the headhunting system. Nonparticipating institutions would miss out on publicly recognizing the work of their faculty, and their faculty would have to pay APCs. There is no Open Access free ride.

Headhunting rewards create little to no extra work for search committees. Academic libraries are more than capable of performing due diligence, negotiating the contracts, and administering the rewards. Our scenario assumed a base percentage of 1%. The actual percentage would be negotiated between universities and publishers. With rewards proportional to salaries, there is a built-in adjustment for inflation, for financial differences between institutions and countries, and for differences in the sizes of the various scholarly disciplines.

Scholars retain the right to publish in the venue of their choice. The business models of journals are used when distributing rewards, but this occurs well after the search process has concluded. The headhunting rewards gradually reduce the subscription budget in proportion to the number of papers published in OA journals by the university's faculty. A scholar who wishes to support a brand-new journal should not pay APCs, but lobby his or her university to negotiate a performance-based headhunting contract.

The essence of this proposal is the performance-based contract that exchanges APCs for headhunting rewards. All other details are up for discussion. Every university would be free to develop its own specific performance criteria and reward structures. Over time, we would probably want to converge towards a standard contract.

Headhunting contracts create a competitive market for OA journals. In this market, the distributed and collective wisdom of search/promotion committees defines scholarly excellence and provides the monetary rewards to journals. As a side benefit, this free-market system creates a professionally managed open infrastructure for the scholarly archive.

Monday, December 16, 2013

Beall's Rant

Jeffrey Beall, curator of Beall's List of predatory scholarly publishers, recently made some strident arguments against Open Access (OA) in the journal tripleC (ironically, an OA journal). Beall's comments appear in a non-refereed section dedicated to a discussion of OA.

Michael Eisen takes down Beall's opinion piece paragraph by paragraph. Stevan Harnad responds to the highlights/lowlights. Roy Tennant has a short piece on Beall in The Digital Shift.

Beall takes a distinctly political approach in his attack on OA:
“The OA movement is an anti-corporatist movement that wants to deny the freedom of the press to companies it disagrees with.”
“It is an anti-corporatist, oppressive and negative movement, [...]”
“[...] a neo-colonial attempt to cast scholarly communication policy according to the aspirations of a cliquish minority of European collectivists.”
“[...] mandates set and enforced by an onerous cadre of Soros-funded European autocrats.”
This is the rhetorical style of American extremist right-wing politics that casts every problem as a false choice between freedom and – take your pick – communism or totalitarianism or colonialism or slavery or... European collectivists like George Soros (who became a billionaire by being a free-market capitalist).

For those of us more comfortable with technocratic arguments, politics is not particularly welcome. Yet we cannot avoid the fact that the OA movement is trying to reform a large socio-economic system. It would be naïve to think that this can be done without political ideology playing a role. But is it really too much to ask to avoid the lowest level of political debate: politics by name-calling?

The system of subscription journals has an internal free-market logic to it that no proposed or existing OA system has been able to replace. In a perfect world, the subscription system uses an economic market to assess the quality of editorial boards and the level of interest in a particular field. Economic viability acts as a referee of sorts, a market-based minimum standard. Some editorial boards deserve the axe for doing poor work. Some fields of study deserve to go out of business for lack of interest. New editorial boards and new fields of study deserve an opportunity to compete. Most of us prefer that these decisions are made by the collective and distributed wisdom of free-market mechanisms.

Unfortunately, the current scholarly-communication marketplace is far from a free market. Journals hardly compete directly with one another. Site licenses perpetuate a paper-era business model that forces universities to buy all content for 100% of the campus community, even those journals that are relevant only to a sliver of the community. Site licenses limit competition between journals, because end users never get to make the price/value trade-offs critical to a functional free market. The Big Deal exacerbates the problem. Far from providing a service, as Beall contends, the Big Deal gives big publishers a platform to launch new journals without competition. Consortial deals are not discounts; they introduce peer networks to make it more difficult to cancel existing subscriptions. [What if Libraries were the Problem?] [Libraries: Paper Tigers in a Digital World]

If Beall believes in the free market, he should support competition from new methods of dissemination, alternative assessment techniques, and new journal business models. Instead, he seems to be motivated more by a desire to hold onto his disrupted job description:
“Now the realm of scholarly communication is being removed from libraries, and a crisis has settled in. Money flows from authors to publishers rather than from libraries to publishers. We've disintermediated libraries and now find that scholarly system isn't working very well.”
In fact, it is the site-license model that reduced the academic library to the easy-to-disintermediate dead-end role of subscription manager. [Where the Puck won't Be] Most librarians are apprehensive about the changes taking place, but they also realize that they must re-interpret traditional library values in light of new technology to ensure the long-term survival of their institution.

Thus far, scholarly publishing has been the only type of publishing not disrupted by the Internet. In his seminal work on disruption [The Innovator's Dilemma], Clayton Christensen characterizes the defenders of the status quo in disrupted industries. Like Beall, they are blinded by traditional quality measures, dismiss and/or denigrate innovations, and retreat into a defense of the status quo.

Students, researchers, and the general public deserve a high-quality scholarly-communication system that satisfies basic minimum technological requirements of the 21st century. [Peter Murray-Rust, Why does scholarly publishing give me so much technical grief?] In the last 20 years of the modern Internet, we have witnessed innovation after innovation. Yet, scholarly publishing is still tied to the paper-imitating PDF format and to paper-era business models.

Open Access may not be the only answer [Open Access Doubts], but it may very well be the opportunity that this crisis has to offer. [Annealing the Library] In American political terms, Green Open Access is a public option. It provides free access to author-formatted versions of papers. Thereby, it serves the general public and the scholarly poor. It serves researchers by providing a platform for experimentation (text mining, for example) without onerous access negotiations. It also serves as an additional disruptive trigger for free-market reform of the scholarly market. Gold Open Access in all its forms (from PLOS to PeerJ) is a set of business models that deserves a chance to compete on price and quality.

The choice is not between one free-market option and a plot of European collectivists. The real choice is whether to protect a functionally inadequate system or whether to foster an environment of innovation.

Monday, December 2, 2013

Amazon Floods the Information Commons

Amazon is bringing cloud computing to the masses. Any individual with access to a browser now has access to almost unlimited computing power and storage. This may be the moment that marks the official beginning of the end of the desktop computer, which was already on a downward slide because of the rise of notebooks, netbooks, tablets, and smartphones.

For managers of computer labs, this technology eliminates a slew of nitty-gritty management problems without good solutions. When a shared computer is idle, do you take action after 5, 10, or 15 minutes? If you wait too long, you annoy users who are waiting for their turn, and you invite unauthorized users to sneak into someone else's session. If you act too soon, you ruin the experience for the current user. Should you immediately log off an idle user, or do you lock the screen for a while before logging off? Again, you balance the interests of the current user against those of the next. Which software do you install where? Installing all software on every computer is usually too expensive, but if each computer in the lab has its own configuration, how do you communicate those differences to the users? The ultimate challenge of the shared computer is how to let students install software that they themselves are developing while keeping the computer relatively secure, usable by others, and free of pirated software.

Amazon has solved all of this and more. With cloud-based computers, there is no such thing as an idle computer, only idle screens. Shutting down a screen and turning it over to another user does not ruin a session in progress. It is more like turning over a printer. The cloud-based personal computer is configured for one user according to his or her requirements. Students and faculty can install whatever software they need, including their own research software. As to the usual suite of standard applications, cloud services like Adobe Creative Cloud, Google Apps, and Windows Azure have eliminated software installation and maintenance entirely.

The potential of cloud computing in the Information Commons goes beyond substituting one technology for another. Students and faculty suddenly have their own custom computing laboratory with an unlimited number of computers over which they have complete control. One can imagine projects in which cloud-based computers harvest measurements from sensors across the globe (weather-related, for example), read and analyze the news, and mine social networks for data. All of this data can then be fed to high-performance servers running research software for analysis and visualization.
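As a hedged sketch of what such a project might look like in practice, the snippet below uses the boto3 EC2 SDK to spin up, and later drop, a fleet of short-lived harvester machines. The machine image and the harvest command are hypothetical placeholders, not an actual Amazon service for this purpose:

```python
# Hypothetical sketch: launch a short-lived fleet of cloud harvesters
# with the boto3 EC2 SDK, then drop the machines when they are done.
# The AMI ID and the harvest command are made-up placeholders.
import boto3

HARVEST_SCRIPT = """#!/bin/bash
# Pull sensor readings and push them to shared storage (hypothetical).
/opt/harvest/collect --source weather-feeds --dest s3://my-lab/raw/
shutdown -h now
"""

ec2 = boto3.resource("ec2")
fleet = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder machine image
    InstanceType="t3.micro",
    MinCount=10, MaxCount=10,          # ten harvesters, added on demand
    UserData=HARVEST_SCRIPT,
)

# Later: drop the computers that are no longer needed.
for instance in fleet:
    instance.terminate()
```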

Currently, retail pricing for a cloud-based personal computer starts at $35 per month. This is already a very good price point, considering that it eliminates the hardware replacement cycle, software maintenance, security issues, etc. One can also add and drop computers as needed. Moreover, this is a price point established before competitors have even entered the market. 

When computing and storage become relatively inexpensive on-demand commodity services, computing labs are no longer in the business of sharing computing devices, storage, and software; they are in the business of sharing visualization devices. Currently, Information Commons provide large-screen high-resolution monitors attached to a computer. As large-scale, high-performance, big-data projects grow in popularity across many disciplines, there will be increasing demand for more advanced equipment to visualize and render the results. Today's computing labs will morph into advanced visualization labs. They will provide the capacity to use multiple large high-resolution screens. They may provide access to CAVEs (CAVE Automatic Virtual Environment) and/or additive-manufacturing equipment (which includes 3-D printing). The support requirements for such equipment are radically different from those for current computer labs. CAVEs need large rooms with no windows, multiple projectors, and a sound system. Additive manufacturing may be loud and may require specialized venting systems.

For managers of Information Commons, it is not too early to start planning for this transition. They may look forward to getting rid of the nitty-gritty unsolvable problems mentioned above, but integrating these technologies into the real estate currently used for computing labs and libraries will require all of the organizational and management skills they can muster.

Tuesday, November 5, 2013

Cartoon Physics

When Wile E. Coyote runs off a cliff, he starts falling only after he realizes the precariousness of his situation.

In real life, cartoon physics is decidedly less funny. Market bubbles arise when a trend continues far past the point where the fundamentals make sense. The bubble bursts when the collective wisdom of the market acts on a reality that should have been obvious much earlier. Because of this unnecessary delay, bubbles inflict much unnecessary damage. We saw it recently with the Internet and mortgage bubbles, but the phenomenon is as old as the tulip bubble of 1637.

We also see cartoon physics in action at less epic scales. Cartoon physics applies to almost any disruptive technology. The established players almost never adapt to the new reality when the fundamentals require it or when it is logical to do so. Instead of preparing for a viable future, they fight a losing battle to hang onto the past. Most recently, BlackBerry ignored the iPhone, thinking its serious corporate clients would not be lured by gadgetry. There is a long line of disrupted industries whose leadership ignored upstart competitors and new realities. This has been the topic of acclaimed academic studies and has been popularized in every possible venue.

The blame game is a significant part of the process. The recording industry blamed pirates for destroying the music business. In fact, its own failure to adapt to the digital age contributed at least as much to the disruption.

The scenario is well known, by now too cliché to be a good movie. Leaders of industries in upheaval should know the playbook. Yet, they keep repeating the mistakes of their disrupted predecessors.

Wile E. Coyote finally learned his lesson and decided to stop looking down.

PS: Cartoon physics does not apply to academic institutions, which are protected by their importance and seriousness.

Wednesday, October 9, 2013

Where the Puck won't be

“I skate to where the puck is going to be, not where it has been.”

The academic library has, by default, tied its destiny to a service with no realistic prospects of long-term survival. It has become a systems integrator that stitches together outsourced components into a digital recreation of a paper-based library. This horseless carriage provides the same commodity service to an undergraduate student majoring in chemistry, a graduate student in economics, and a professor of literature. Because it overwhelms the library's budget, organizational structure, and decision-making processes, this expensive and inefficient service hampers innovation in areas that are the library's best hope for survival.

A paper-based library gradually builds a collection of ever-increasing value, and its overhead builds permanent infrastructure. Its digital recreation never builds lasting value. It is a maintenance service, and its overhead is pure inefficiency. This overhead, duplicated at thousands of universities, starts with the costs of preparing for and conducting near-futile site-license negotiations. To shave off a point here and there, the library spends countless staff hours on usage surveys, faculty discussions, consortium meetings, and negotiations with publishers and their middlemen. But the game is rigged. If 15% of a campus wants Journal A, 15% wants competing Journal B, 10% wants both, and the rest wants neither, the library is effectively forced to rent both A and B for 100% of the campus. This is why scholarly publishers were able to raise prices at super-inflationary rates during a period when all other publishers faced catastrophic disruption. After conducting expensive negotiations and paying inflated prices, the library must still pay for, build, and maintain the platform that protects publishers' interests by keeping unwanted users out.
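The arithmetic of that rigged game is easy to make explicit; here is a sketch with purely hypothetical numbers:

```python
# Purely hypothetical numbers illustrating the site-license arithmetic.
campus = 10_000                              # campus population
wants_a, wants_b, wants_both = 0.15, 0.15, 0.10

interested = (wants_a + wants_b + wants_both) * campus
print(f"Readers who want A and/or B: {interested:.0f} of {campus:,}")  # 4000

license_price = 50_000                       # hypothetical price per journal
per_interested_reader = 2 * license_price / interested
print(f"Effective cost per interested reader: ${per_interested_reader:.2f}")
# The library pays for all 10,000 campus members, although only 40%
# want either journal; the other 60% are pure overhead.
```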

Many academics and librarians hope that Open Access efforts will provide an exit from this unsustainable path. If successful, Green Open Access will lead to price reductions and journal cancellations. Gold Open Access seeks to replace site licenses with author page charges. Either strategy reduces the efficiency of library-mediated digital lending by spreading its fixed overhead costs over fewer and/or less expensive journals. New business models for journals, alternative metrics that give scholarly credibility to unbundled works, and any other innovation that competes with site licenses will reduce efficiency even further. All of these factors hasten the demise of an unsustainable service that is already collapsing under its own weight.

Traditionally, a library adapts in response to changing user behavior, attitude, and opinion. However, the Wayne Gretzky quote became a cliché for a reason. When trends have become obvious and users have moved on, it is too late for strategic restructuring.

At the other extreme, an angel investor bets on someone with a compelling idea, accepts the risk of failure, and is prepared to move on to the next player who knows where the puck will be. The library does not have that luxury. It is an institution, not a venture.

The library must maintain sufficient institutional stability to ensure its archival mission. While Open Access is a given, the service portfolio of the future library is far from settled. We must create budgetary and organizational space for new services. We may not know where precisely the puck will be, but we can still move the team out of a field where there is no game to be played.

When canceling site-licensed journals today, the only legally available alternatives are individual subscriptions, pay-per-view, and self-archived versions of individual papers. This stands in stark contrast with the digital-entertainment universe, where there is a competitive market for providers of personal digital libraries. Services like Apple iTunes, Google Play, Amazon Kindle and Prime, Netflix, Pandora, and Spotify compete on the basis of price, content, usability, convenience, and features. There are many scholarly-communication organizations that could launch analogous services. Within months, Thomson Reuters, EBSCO, publisher alliances, scholarly societies, and even some research libraries could provide a wide selection of options. This will never happen without starving publishers of site-license revenue. Instead of subsidizing publishers, subsidize students and faculty. They are quite capable of choosing for themselves what information services they need. After a messy, but short, transition, a competitive market will blossom.

The only thing more terrifying than phasing out a core service is the prospect of outside forces triggering a sudden disruption. Libraries have the choice to disrupt or to be disrupted, to organize their own restructuring or to be restructured by a crisis manager. This is the perfect time to redirect resources away from digital-lending overhead and towards building a scalable, robust, and permanent infrastructure of open scholarly information (refereed papers, technical reports, lab reports, and supporting data). Björn Brembs wants to go even further; he wants libraries to take over all of scholarly communication.

We do not have to wait for Open Access to work its disruptive magic, which may or may not happen at some undetermined time. By forcing the disruption, the rationale for Green Open Access becomes much more straightforward: It creates a permanent public archive of culturally important content that is now controlled by private companies. As a public option to the publishers' walled garden, it may help keep prices in check. That role is much less important, however, when prices are set in a truly competitive market.

Publishers do not think Green Open Access has the power to disrupt. They believe they can compensate for lower revenue from Gold Open Access by increasing the number of papers they publish. Should site licenses be disrupted anyway, publishers stand ready to compete with libraries.

Publishers are well prepared for any scenario.

Is your library?

Tuesday, August 6, 2013

The Empire Strikes Back

Publishers may soon compete with libraries. The business case for enticing users away from library-managed portals is simple, compelling, and growing. As funding agencies and universities enact Open Access (OA) mandates and publishers transition their journals from the site-license model to the Gold OA model, libraries will cease to be the spigots through which money streams from universities to publishers. In the Gold-OA world, the publishers' core business is developing relationships with scholars, not librarians. For publishers, it makes perfect sense to cater to scholars both as authors and readers.

Current direct-to-scholar portals provided by publishers do not live up to their potential. Each portal is limited to content from just one publisher. Without interoperability, each publisher portal is an island. Only scholars covered by a site license can afford to use them, and those scholars have access to a gateway for all site-licensed content irrespective of publisher: their library web site. In spite of these near-fatal flaws, publishers invest heavily in their direct-to-scholar portals.

These portals are opportunities for future growth. The model is well established: Thomson Reuters' Westlaw is the de facto standard for legal research in the US, and it commands premium pricing for structured public-domain information. It may take a long time for scholarly publishers to duplicate Westlaw's success. Yet, even without access fees, publishers might be able to unlock significant marketing and business-intelligence value from their systems. Knowledge from managing the publishing process, combined with usage data from their portals, will give publishers unprecedented insight into every aspect of scholars' professional lives in education, research, and development.

For publishers, the transition to Gold OA is rather tricky. They hope to maintain their current level of revenue while replacing the income stream from site licenses with an equivalent income stream from author page charges. This goal, implausible just a few years ago, now seems realistically within their grasp. The outcome remains far from certain, and publishers are hedging their bets by fighting Green OA and lobbying hard for embargo periods. As long as site-license revenue is their main source of revenue, publishers cannot afford to compete with libraries and journal aggregators, their current customers and partners. This calculation will change when Gold OA reaches a certain critical point. This is the context of proposals like CHORUS, an attempt to take over Green OA, and Elsevier's acquisition of Mendeley, a brilliant social-network interface for scholarly content.

Publishers, indexing services, journal aggregators, startups, some nonprofit organizations, and library-system vendors all have the expertise to produce compelling post-OA services. However, publishers only need to protect their Gold OA income; any new revenue streams are just icing on the cake. All others need a reasonable expectation of new revenue to justify developing new services. This sets the stage for a significant consolidation of the scholarly-communication industry into the hands of publishers.

As soon as the Gold OA shock hits, academic libraries must be ready to engage publishers as competitors. When site licenses disappear, there is no more journal-collection development, and digital lending of journals disappears as a core service. This is a time that requires major strategic decisions from leaders in academia. With its recently released new mission statement, the Harvard Library seems to pave the way: “The Harvard Library advances scholarship and teaching by committing itself to the creation, application, preservation and dissemination of knowledge.” The future of the academic library will be implemented on these pillars. While the revised mission statement necessarily lacks specifics, it is crystal-clear in what it omits: collection development.