PitchBook in WRDS

Penn researchers have access to PitchBook, a popular private capital markets resource, via its native interface. However, individual users are subject to download limits of 10 daily/25 monthly rows of company, deal, or fund data, as well as 10 daily/25 monthly rows of people data. For researchers needing more downloads, Wharton Research Data Services (WRDS) offers PitchBook private equity, venture capital, and additional private placements data.

image of PitchBook landing page on the Wharton Research Data Services (WRDS) platform

PitchBook data available in WRDS covers all aspects of private capital raising, featuring more than 1.6 million deals, 3 million companies including more than 660,000 private companies, 31,000 funds, and 300,000 investors. Deals provide a transaction-level view, fund data focuses on each fund held by an investment firm, and the company and investor data is presented at the business entity level. Though much of the data in PitchBook reflects only the most recent update available, also referred to as “header” data, more comprehensive data begins in the late 1990s/early 2000s, with 1% of historical coverage prior to 1992 increasing to 5% by 2000 and 10% by 2004. WRDS is developing snapshot integration that will begin to create a “history” of PitchBook header data.

Datasets from PitchBook are organized on the WRDS platform as follows: Venture Capital North America, Venture Capital Rest of World, Private Equity North America, Private Equity Rest of World, Other Data North America, and Other Data Rest of World. Within each dataset, the following queries are available: Company, Deal, Fund, and Investor. Company variables include ticker and Central Index Key (CIK) identifiers, industry descriptions and codes, and financing details. Deal variables include dates of deal announcement and completion, amount of capital invested, deal status, percentage of stake acquired, transaction categorization, detailed transaction summary, and valuation measures. If applicable, IPO details are available. CEO biographies and education summaries are also retrieved through a Deal query.

Variables in the Fund query include fund preferences, a breakout table of the fund’s direct investments, and whether the fund received investment capital from a Small Business Investment Company, a privately owned and managed investment fund licensed and regulated by the U.S. Small Business Administration. Investor query variables include investor and summary-level fund details, investment activity, and aspects of investor targets.

In addition to the four primary queries, there are 15 “relation” queries, each focusing on two categories of variables. Researchers interested in a list of a particular company’s competitors will be interested in the Company Competitor Relation query, while those examining deals entered into by certain investors will find the Deal Investor Relation query useful. There are even relation queries for deals and their tranches and deals and their beneficiaries.

From a query page, use the Data Preview tab to view the first ten rows of data for selected variables and conditional statements from the chosen data table. From the Data Preview page, the first 1,000 rows of data can be downloaded directly in xlsx, api, or json format. Alternatively, click the Access All Data button on the Data Preview page to open SAS Studio at WRDS, a web-based version of SAS, to access all data.

PitchBook documentation includes a spec file containing variable definitions, comments, and sample data. The WRDS Overview of PitchBook page features a visual representation of PitchBook dataset organization, database usage notes, and guidance for linking PitchBook data to other datasets.

Whether you want to track company growth, explore company financing or deals, or investigate investors, consider using PitchBook in WRDS. Contact Lippincott Library for assistance.

What’s Your Company’s RQ™ (Research Quotient)?

Research and Development (R&D) expenditure is the amount of money a company spends on developing new products and services each year.  Academic business researchers have intensively investigated the relationship between a company’s R&D and its market value, and have searched for ways to derive a firm’s optimal R&D spending.  A recent innovation in the analysis and measurement of a firm’s R&D has been the development of the concept of Research Quotient (RQ).

A company’s Research Quotient is the percentage increase in the company’s revenue from a 1% increase in its R&D. In other words, RQ measures a firm’s ability to generate revenue from its R&D expenditures. RQ is calculated from a formula that combines the company’s measures of capital, labor, and R&D. For more details concerning RQ calculation, click on Manuals and Overviews from the WRDS Research Quotient database.
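One way to read this definition is as an elasticity. As a rough sketch, assume revenue follows a Cobb-Douglas style function of capital, labor, and R&D. The functional form and the symbols below are illustrative assumptions, not taken from the WRDS manual, which remains the authoritative source for the actual calculation.

```latex
Y = A \, K^{\alpha} L^{\beta} R^{\gamma}
\quad\Longrightarrow\quad
\ln Y = \ln A + \alpha \ln K + \beta \ln L + \gamma \ln R ,
\qquad
\mathrm{RQ} \equiv \gamma = \frac{\partial \ln Y}{\partial \ln R}
```

Here Y is revenue, K is capital, L is labor, and R is R&D spending. Under this assumption, a firm with an exponent of 0.12 on R&D would see roughly a 0.12% increase in revenue from a 1% increase in R&D, holding capital and labor fixed.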

RQ can be used to:

  • Link R&D spending to firm growth
  • Link R&D spending to market value
  • Derive a firm’s optimal R&D spending

The WRDS RQ database includes RQ measures for all companies in the COMPUSTAT database that report R&D expenditures. The data covers 1972 to 2010 and is updated annually. The file allows searching by four-digit SIC code and by GVKEY (COMPUSTAT’s unique company identifier).

Table 1 is an example of the output showing some of the default variables.


  • “Raw RQ” is the “Research Quotient” that identifies the ability of a firm to generate revenue from its R&D expenditures. The higher the RQ the greater the revenue generated.
  • “RSTAR” is a calculation of optimal R&D expenditure.
  • “RD Ratio” is the ratio of R&D expenditure to Revenue.

In Table 2, for clarification, I have supplied tickers and names of companies together with a measure of “RQ” that I calculated from the “Raw RQ” supplied by WRDS. This RQ is analogous to the human IQ measure with a mean of 100 and a standard deviation of 15. An RQ with a mean of 100 is often used by academic writers as a way of making the RQ measure more intuitive.
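The post does not spell out the exact conversion used for Table 2. One plausible reading is a simple z-score rescaling, sketched below. The function name and the sample values are made up for illustration, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def to_iq_scale(raw_rq, target_mean=100.0, target_sd=15.0):
    """Rescale raw RQ values so they have the given mean and standard deviation."""
    mu = mean(raw_rq)
    sigma = pstdev(raw_rq)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("all raw RQ values are identical; cannot rescale")
    # Standardize each value, then stretch and shift to the target scale.
    return [target_mean + target_sd * (x - mu) / sigma for x in raw_rq]


# Made-up raw RQ values, for illustration only.
raw = [0.08, 0.10, 0.12, 0.14, 0.16]
scaled = to_iq_scale(raw)
```

A firm whose raw RQ equals the sample average lands at exactly 100 on this scale, and a firm one standard deviation above the average lands at 115.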

Table 2 ranks the first 20 companies in the U.S. by their RQ in 2010.


There are more than 260 four-digit SIC codes represented in the 2010 files, but only 10 codes have more than 35 companies. Table 3 collapses the codes to two digits and ranks the 15 largest industry groups by average RQ.
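The collapsing step can be sketched as follows, assuming the extract has been reduced to pairs of four-digit SIC code and RQ. The function name and the sample rows are hypothetical, and "largest" is taken here to mean the groups with the most companies.

```python
from collections import defaultdict


def rank_industries(rows, top_n=15):
    """Group (sic4, rq) pairs by 2-digit SIC, keep the top_n largest groups,
    and rank those groups by average RQ (highest first)."""
    groups = defaultdict(list)
    for sic4, rq in rows:
        # Pad to four digits so that a code stored as 100 is read as "0100".
        sic2 = str(sic4).zfill(4)[:2]
        groups[sic2].append(rq)
    # Keep the groups with the most companies.
    largest = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)[:top_n]
    ranked = [(sic2, len(vals), sum(vals) / len(vals)) for sic2, vals in largest]
    ranked.sort(key=lambda t: t[2], reverse=True)
    return ranked


# Made-up rows, for illustration only.
sample = [(2834, 110.0), (2836, 100.0), (3674, 120.0),
          (3672, 90.0), (3661, 111.0), (7372, 95.0)]
result = rank_industries(sample, top_n=2)
```

Each entry in the result is a tuple of two-digit SIC code, number of companies, and average RQ.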


About 78% of the companies in the 2010 file were based in the U.S.  Figure 1 graphs the countries with 3 or more companies in 2010 by average R&D expenditures and average RQ.


The principal developer of the RQ concept is Anne Marie Knott, Professor of Business at Washington University, St. Louis. In a 2012 article in Harvard Business Review, she estimated that if the 20 largest US firms had optimized their R&D expenditure in 2010, they would have increased their aggregate market capitalization by $1 trillion. (Knott, Anne Marie. “The Trillion-Dollar R&D Fix.” Harvard Business Review (90:5) 2012, pp. 76-82.) This article can be accessed using Business Source Complete.

Executive Compensation: The Clown Makes A Good Argument

Executive compensation is composed of salary, bonuses, stock options, and other company benefits. Staggering figures like the CEO-to-worker pay ratio of 354:1 (in 2012) have brought executive compensation under some scrutiny in the United States. Measures such as the ‘say on pay’ provision of the Dodd-Frank Act have brought executive compensation to the forefront of many shareholders’ minds.

Let’s start with a quick exercise. Which of the following CEOs had the highest base salary for the year 2012?

A. Larry Page (Google)
B. Alex Gorsky (Johnson & Johnson)
C. W. James McNerney, Jr. (Boeing)
D. C. Douglas McMillon (Walmart)

I’m not sure who you guessed (the answer is C), but we can quickly find out answers to questions like this using several library databases. A number of publications provide lists of the top paid CEOs, like Forbes’ list of America’s Highest Paid Chief Executives. Lists are helpful, but you may want to search by company or executive or create a time series of data.

LexisNexis Academic allows you to search within the Morningstar US Executive Compensation database. This source provides information on salaries, cash compensation, option grants, other stock-related compensation and auditor fees for U.S. public company directors and officers. Data comes from the Form 10-K or Annual Meeting Proxy Statements. The coverage is the current edition (i.e. FY 2013) and does not include historical data. Click Search by Content Type and select Company Profiles. Under the Advanced Options area select the source. Then search by company name (e.g. Apple Inc.) or by executive (e.g. Larry Page). This database is helpful if you are searching for a single company or executive.

Standard & Poor’s Execucomp database is available through WRDS (for Wharton account holders). Go to COMPUSTAT and select Execucomp. This database covers 2,872 companies, both listed and unlisted, with data for up to 9 executives, although most companies only report 5. Similar to the Morningstar database, this data is collected from each company’s annual proxy (DEF14A SEC form). With data back to 1992 and numerous fields to select from (e.g. EIP_UNEARN_NUM: Equity Incentive Plan–Number of Unearned Sha), this is a good database to use to build a time series. Below is an example of Google’s data for the past 5 years, with only Larry Page’s salary shown.

AuditAnalytics: Who Audits Whom PLUS

AuditAnalytics (available in WRDS) is a deep, multifaceted database containing data on thousands of auditors and the corporations they audit. Although the file can easily link auditors and the companies they audit, it has many more uses than a simple “Who Audits Whom” database.

The image on the left shows the many modules available for searching.

Here are two examples of research questions that show off AuditAnalytics’ ability to analyze auditing in depth.

1. Restatements: Identify companies for which the restatement identified fraud.

The SEC can require a company to reissue its financial statements for reasons ranging from clerical errors to fraud. Here’s how to search AuditAnalytics for an answer.

From the list of modules displayed to the left, choose:

Audit and Compliance => Non-Reliance Restatements.

Now choose your search criteria.

Step 1: Enter Date Range (e.g. from 2004 to date)

Step 2: Choose option “Search Entire Database” (Other options here are to search for one or more companies by ticker or CIK number, or to upload a text file of such company identifiers).

Step 3: Choose Variables to display including “Res Fraud” and “Res Fraud Category Title List”

Click on “Manuals and Overviews” from the top of the screen to identify relevant variables and their definitions for each module.  Here, “Res Fraud” indicates that a restatement identified fraud, and “Res Fraud Category Title List” describes the nature of the fraud.

Step 4: Select an output format, then click “Submit Query”.

London Stock Price Database on WRDS

This is an edited repost of “LSPD now Available on WRDS” from the University of Manchester Business Library Blog Business Research Plus.

LSPD (The London Share Price Database) is a specialized stock price return database. It can be accessed as a monthly or daily database. It does not have the variety of data available in other databases, such as Datastream or Bloomberg, but concentrates instead on historic stock price coverage and the quality of its returns data.


 LSPD has particular strengths in several areas:

  • Good representative coverage for 1955-1974, full coverage 1975 to date
  • Full historic details of company name changes and Stock Exchange Daily Official List (SEDOL) code changes
  • Details of dividend announcement dates, as well as ex-dividend and payment dates
  • Data on reasons for a company’s delisting

For brief details of LSPD you can expand the screenshot above, or go to the LSPD page on the London Business School website. For full details there is an excellent manual on the WRDS site – recommended reading for anyone thinking of using LSPD.

The Business FAQ covers access to WRDS as well as additional information on finding total returns on stocks.

Play on WRDS: EVENTUS for Event Studies

In use by 300 institutions worldwide, Wharton Research Data Services (WRDS) is a comprehensive source of financial, accounting, economic, management, marketing, banking, and insurance data. WRDS offers access to more than 40 separate databases. Popular databases in the WRDS suite include:

COMPUSTAT (detailed historical financial data on the world’s largest public companies) and CRSP (daily stock prices and returns for U.S. companies).

EVENTUS is a program designed for event studies – the examination of the impact of an event on the value of a company. EVENTUS is not a typical WRDS database. Unlike most, it is not a data file, but rather a program that uses the CRSP file as its data source. Examples of event studies might be:

What effect does joining the S&P 500 have on the value of a company’s stock?

Does the announcement of an acquisition typically raise or lower the acquiror’s stock price?

To answer these types of questions, the EVENTUS program requires only a company identifier and the date the event (such as an acquisition) was made public. The program then calculates what is known as “Abnormal Returns” of the stock. This is the actual return minus the expected return if the event had not taken place. The calculation averages the abnormal returns across companies and time periods.
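The arithmetic behind this can be sketched in a few lines. This is only an illustration of the idea, not how EVENTUS itself is implemented. The function names and all return figures are made up, and the expected returns are simply supplied as inputs rather than estimated from a model.

```python
def abnormal_returns(actual, expected):
    """Abnormal return for each event day: actual return minus expected return."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must cover the same event days")
    return [a - e for a, e in zip(actual, expected)]


def average_across_firms(ar_by_firm):
    """Average the abnormal returns across firms, day by day."""
    n_firms = len(ar_by_firm)
    n_days = len(ar_by_firm[0])
    if any(len(firm) != n_days for firm in ar_by_firm):
        raise ValueError("every firm must cover the same event days")
    return [sum(firm[d] for firm in ar_by_firm) / n_firms for d in range(n_days)]


def cumulative(values):
    """Running total, used to accumulate returns over the event window."""
    out, total = [], 0.0
    for v in values:
        total += v
        out.append(total)
    return out


# Made-up daily returns for two firms over a three-day event window.
ar_a = abnormal_returns([0.010, 0.030, -0.005], [0.005, 0.004, 0.002])
ar_b = abnormal_returns([0.000, 0.020, 0.010], [0.002, 0.002, 0.002])

aar = average_across_firms([ar_a, ar_b])   # average abnormal return per day
caar = cumulative(aar)                     # cumulative average abnormal return
```

In a real study the expected returns would come from a benchmark, such as a model estimated over a period before the event. That estimation is the part a dedicated program handles for you.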

EVENTUS extracts the total return data needed for its calculation from the CRSP stock price database. As identifiers, it uses CUSIP numbers or PERMNOs (a unique, permanent number assigned by CRSP to each security it covers).

The first few entries of a file for an EVENTUS study might look like this:

Where is She?

Women currently make up 46.9% of the labor force in the U.S., according to the Bureau of Labor Statistics (Current Population Survey, Household Data, annual averages: “Employment status of the civilian noninstitutional population 16 years and over by sex, 1972 to date”). In fact, 57.7% of all women 16 years and over are in the labor force.

Where can we find out where women are working, what their occupations are, and how much they earn? A number of resources provide this information. Below is a selection of resources.

Labor Force Statistics: U.S.

For the United States, the U.S. Bureau of Labor Statistics is a great place to begin. Statistics are available nationally and by state. Not only does it provide monthly and quarterly population surveys that include employment by gender, but it also publishes the annual survey Women in the Labor Force: A Databook. This survey provides data on women workers including age, ethnicity, occupation, marital status, and children.

A complete listing of U.S. government data can be found here: women and work.

Labor Force Statistics:  International

UNSTATS (United Nations). The section on Gender Info provides statistics for developed and developing nations.  Participation in the Labor Force is available from 1985-2010.

The International Labour Organization, part of the United Nations, provides historical data (1969- ) on employment by sex and economic activity for over 200 countries.  The data is available from LABORSTA (1969-2008) and, more recently, ILOSTAT, (2009- ).

OECD iLibrary provides detailed statistics by gender on employment, occupation and other indicators for developed nations.  Under Key Tables, click on Employment and Labour Markets: Key Tables from OECD.

Professional Women