Languages in the Penn Libraries Collections

Languages in the Penn Libraries Collections (2013 update)

What languages do Penn Library books and other materials use?

Franklin, the Penn Library's online catalog, employs language codes compliant with the ISO 639-2 and ANSI Z39.53 standards managed by the Library of Congress. Although Classic Franklin users may limit search results to specific languages, public users cannot search directly by language code. New Franklin users can use the Language Facet when searching the Penn Libraries holdings.

This table, using Franklin data extracted on 3 March 2013, counts active titles -- individual bibliographic records -- in Franklin. The counts have been cleaned: for instance, titles identified as "Miscellaneous languages" have been examined and placed under recognizable language names. A future revision of this page will include non-standard language names for some interesting works hidden under "Miscellaneous languages".

Pct of Total | Language Name | Titles
68.6% | English | 3,178,033
7.5% | German | 347,727
4.9% | French | 226,747
2.8% | Spanish | 128,992
2.2% | Chinese | 101,681
2.1% | Italian | 99,070
1.8% | Arabic | 84,457
1.2% | Latin | 54,521
1.2% | Russian | 53,422
1.1% | Japanese | 49,936
1.1% | Hebrew | 49,038
0.6% | Hindi | 25,627
0.4% | Urdu | 17,785
0.4% | Bengali | 17,660
0.3% | Dutch | 15,736
0.3% | Portuguese | 12,960
0.3% | Tamil | 11,872
0.2% | Persian | 11,544
0.2% | Swedish | 9,694
0.2% | Turkish | 9,352
0.2% | Lithuanian | 9,285
0.2% | Polish | 9,144
0.2% | Sanskrit | 8,365
0.1% | Marathi | 6,735
0.1% | Gujarati | 6,352
0.1% | Telugu | 5,474
0.1% | Yiddish | 4,843
0.1% | Danish | 4,802
0.1% | Malayalam | 4,701
0.1% | Tibetan | 4,090
0.1% | Korean | 3,695
0.1% | Greek, Modern (1453-) | 3,517
0.1% | Nepali | 2,896
0.1% | Greek, Ancient (to 1453) | 2,832
0.1% | Sinhalese | 2,795
0.1% | Czech | 2,737
0.1% | Ukrainian | 2,603
0.0% | Catalan | 1,931
0.0% | Panjabi | 1,707
0.0% | Armenian | 1,527
0.0% | Norwegian | 1,490
0.0% | Pushto | 1,324
0.0% | Hungarian | 1,268
0.0% | Finnish | 1,223
0.0% | Welsh | 1,083
0.0% | Rajasthani | 1,078
0.0% | Frisian | 1,071
0.0% | Croatian | 1,066
0.0% | Serbian | 1,015
0.0% | Romanian | 1,001
0.0% | Turkish, Ottoman | 962
0.0% | Swahili | 827
0.0% | Romance (Other) | 775
0.0% | Mongolian | 764
0.0% | Maithili | 761
0.0% | Newari | 673
0.0% | Latvian | 668
0.0% | Sindhi | 663
0.0% | Bulgarian | 636
0.0% | French, Middle (ca. 1300-1600) | 594
0.0% | Amharic | 492
0.0% | Yoruba | 492
0.0% | Kannada | 462
0.0% | Icelandic | 445
0.0% | Irish | 435
0.0% | Indic (Other) | 416
0.0% | French, Old (ca. 842-1300) | 398
0.0% | Konkani | 389
0.0% | English, Middle (1100-1500) | 384
0.0% | Slovak | 379
0.0% | Braj | 331
0.0% | Prakrit languages | 331
0.0% | Pali | 326
0.0% | Bhojpuri | 299
0.0% | Kurdish | 277
0.0% | Slovenian | 271
0.0% | German, Middle High (ca. 1050-1500) | 263
0.0% | Syriac, Modern | 250
0.0% | Mayan languages | 240
0.0% | Lahnda | 224
0.0% | Afrikaans | 221
0.0% | Kazakh | 215
0.0% | Baluchi | 199
0.0% | Estonian | 195
0.0% | Judeo-Arabic | 194
0.0% | Galician | 183
0.0% | Kashmiri | 183
0.0% | Church Slavic | 175
0.0% | Belarusian | 171
0.0% | Raeto-Romance | 171
0.0% | Khasi | 165
0.0% | Sino-Tibetan (Other) | 154
0.0% | Shona | 130
0.0% | Provençal (to 1500) | 129
0.0% | English, Old (ca. 450-1100) | 127
0.0% | Macedonian | 122
0.0% | Ladino | 121
0.0% | Malagasy | 121
0.0% | Azerbaijani | 119
0.0% | Assamese | 118
0.0% | Indonesian | 117
0.0% | Thai | 115
0.0% | Tigrinya | 107
0.0% | Aramaic | 105
0.0% | Occitan (post 1500) | 105
0.0% | Georgian | 100
0.0% | Tagalog | 100
0.0% | Niger-Kordofanian (Other) | 99
0.0% | Akkadian | 98
0.0% | Dravidian (Other) | 96
0.0% | Scots | 94
0.0% | Somali | 92
0.0% | Marwari | 91
0.0% | Central American Indian (Other) | 89
0.0% | Scottish Gaelic | 87
0.0% | Basque | 84
0.0% | Coptic | 82
0.0% | Creoles and Pidgins, French-based (Other) | 82
0.0% | Kinyarwanda | 81
0.0% | Awadhi | 76
0.0% | Bantu (Other) | 75
0.0% | Magahi | 72
0.0% | Ganda | 69
0.0% | Pahlavi | 68
0.0% | Dogri | 63
0.0% | Uzbek | 63
0.0% | Wolof | 63
0.0% | Algonquian (Other) | 61
0.0% | Luo (Kenya and Tanzania) | 59
0.0% | Oriya | 57
0.0% | Romani | 56
0.0% | Nahuatl | 55
0.0% | Sorbian (Other) | 54
0.0% | Egyptian | 53
0.0% | Austronesian (Other) | 51
0.0% | Manipuri | 48
0.0% | Sotho | 47
0.0% | South American Indian (Other) | 47
0.0% | Vietnamese | 47
0.0% | Germanic (Other) | 45
0.0% | Igbo | 45
0.0% | Samaritan Aramaic | 44
0.0% | Albanian | 43
0.0% | Berber (Other) | 42
0.0% | Lushai | 42
0.0% | Slavic (Other) | 42
0.0% | Zulu | 42
0.0% | Esperanto | 40
0.0% | Malay | 39
0.0% | Quechua | 39
0.0% | Fula | 38
0.0% | Hausa | 37
0.0% | North American Indian (Other) | 37
0.0% | Cree | 36
0.0% | Dakota | 36
0.0% | Ethiopic | 36
0.0% | Kikuyu | 35
0.0% | Western Pahari languages | 35
0.0% | Nilo-Saharan (Other) | 34
0.0% | Ndebele (Zimbabwe) | 32
0.0% | Tswana | 32
0.0% | Nyanja | 31
0.0% | Dutch, Middle (ca. 1050-1350) | 30
0.0% | Javanese | 29
0.0% | Avestan | 28
0.0% | Oromo | 28
0.0% | Tajik | 28
0.0% | Mandingo | 26
0.0% | Ndonga | 26
0.0% | Athapascan (Other) | 25
0.0% | Breton | 25
0.0% | Gothic | 25
0.0% | Ojibwa | 23
0.0% | Munda (Other) | 22
0.0% | Sumerian | 22
0.0% | German, Old High (ca. 750-1050) | 21
0.0% | Papuan (Other) | 21
0.0% | Altaic (Other) | 20
0.0% | Finno-Ugrian (Other) | 20
0.0% | Mohawk | 20
0.0% | Bambara | 19
0.0% | Bemba | 19
0.0% | Creoles and Pidgins (Other) | 19
0.0% | Burmese | 18
0.0% | Lao | 18
0.0% | Turkmen | 18
0.0% | Hawaiian | 17
0.0% | Sami | 17
0.0% | Creoles and Pidgins, Portuguese-based (Other) | 15
0.0% | Dyula | 15
0.0% | Samoan | 15
0.0% | Tatar | 15
0.0% | Zapotec | 15
0.0% | Afroasiatic (Other) | 14
0.0% | Inuktitut | 14
0.0% | Iranian (Other) | 14
0.0% | Mooré | 14
0.0% | Aymara | 13
0.0% | Delaware | 13
0.0% | Hiligaynon | 13
0.0% | Iroquoian (Other) | 13
0.0% | Khmer | 13
0.0% | Navajo | 13
0.0% | Manx | 12
0.0% | Uighur | 12
0.0% | Xhosa | 12
0.0% | Balinese | 11
0.0% | Bosnian | 11
0.0% | Kurukh | 11
0.0% | Micmac | 11
0.0% | Rundi | 11
0.0% | Twi | 11
0.0% | Ugaritic | 11
0.0% | Bihari (Other) | 10
0.0% | Indo-European (Other) | 10
0.0% | Judeo-Persian | 10
0.0% | Low German | 10
0.0% | Miscellaneous languages | 10
0.0% | Cherokee | 9
0.0% | Khoisan (Other) | 9
0.0% | Nyankole | 9
0.0% | Sundanese | 9
0.0% | Swazi | 9
0.0% | Tigré | 9
0.0% | Chechen | 8
0.0% | Creek | 8
0.0% | | 8
0.0% | Guarani | 8
0.0% | Kalâtdlisut | 8
0.0% | Kuanyama | 8
0.0% | Kyrgyz | 8
0.0% | Shan | 8
0.0% | Chagatai | 7
0.0% | Lingala | 7
0.0% | Mapuche | 7
0.0% | Otomian languages | 7
0.0% | Tamashek | 7
0.0% | Apache languages | 6
0.0% | Australian languages | 6
0.0% | Caucasian (Other) | 6
0.0% | Celtic (Other) | 6
0.0% | Kamba | 6
0.0% | Kongo | 6
0.0% | Lozi | 6
0.0% | Tonga (Nyasa) | 6
0.0% | Bashkir | 5
0.0% | Choctaw | 5
0.0% | Chuvash | 5
0.0% | Dzongkha | 5
0.0% | Faroese | 5
0.0% | Kabyle | 5
0.0% | Kru (Other) | 5
0.0% | Maltese | 5
0.0% | Maori | 5
0.0% | Nubian languages | 5
0.0% | Santali | 5
0.0% | Semitic (Other) | 5
0.0% | Sicilian Italian | 5
0.0% | Sogdian | 5
0.0% | Songhai | 5
0.0% | Acoli | 4
0.0% | Afar | 4
0.0% | Angika | 4
0.0% | Arawak | 4
0.0% | Bikol | 4
0.0% | Duala | 4
0.0% | Ewe | 4
0.0% | Gilbertese | 4
0.0% | Haitian French Creole | 4
0.0% | Iloko | 4
0.0% | Manchu | 4
0.0% | Niuean | 4
0.0% | Northern Sotho | 4
0.0% | Norwegian (Bokmål) | 4
0.0% | Old Persian (ca. 600-400 B.C.) | 4
0.0% | Papiamento | 4
0.0% | Rarotongan | 4
0.0% | Salishan languages | 4
0.0% | Venda | 4
0.0% | Ainu | 3
0.0% | Avaric | 3
0.0% | Chinook jargon | 3
0.0% | Creoles and Pidgins, English-based (Other) | 3
0.0% | Cushitic (Other) | 3
0.0% | Dinka | 3
0.0% | Fang | 3
0.0% | Grebo | 3
0.0% | Herero | 3
0.0% | Hmong | 3
0.0% | Kara-Kalpak | 3
0.0% | Kawi | 3
0.0% | Mongo-Nkundu | 3
0.0% | Neapolitan Italian | 3
0.0% | Palauan | 3
0.0% | Philippine (Other) | 3
0.0% | Pohnpeian | 3
0.0% | Sardinian | 3
0.0% | Serer | 3
0.0% | Tahitian | 3
0.0% | Tuvinian | 3
0.0% | Yupik languages | 3
0.0% | Artificial (Other) | 2
0.0% | Bable | 2
0.0% | Banda languages | 2
0.0% | Buriat | 2
0.0% | Carib | 2
0.0% | Corsican | 2
0.0% | Efik | 2
0.0% | Elamite | 2
0.0% | Fanti | 2
0.0% | Fijian | 2
0.0% | Gondi | 2
0.0% | Karen languages | 2
0.0% | Kosraean | 2
0.0% | Luba-Katanga | 2
0.0% | Mari | 2
0.0% | Mon-Khmer (Other) | 2
0.0% | Nyoro | 2
0.0% | Old Norse | 2
0.0% | Sango (Ubangi Creole) | 2
0.0% | Siksika | 2
0.0% | Siouan (Other) | 2
0.0% | Tsonga | 2
0.0% | Wolayta | 2
0.0% | Zaza | 2
0.0% | Zuni | 2
0.0% | Abkhaz | 1
0.0% | Achinese | 1
0.0% | Akan | 1
0.0% | Aleut | 1
0.0% | Altai | 1
0.0% | Arapaho | 1
0.0% | Bamileke languages | 1
0.0% | Basa | 1
0.0% | Batak | 1
0.0% | Bislama | 1
0.0% | Bugis | 1
0.0% | Caddo | 1
0.0% | Chibcha | 1
0.0% | Cornish | 1
0.0% | Dayak | 1
0.0% | Ewondo | 1
0.0% | Gayo | 1
0.0% | Gwich'in | 1
0.0% | Hiri Motu | 1
0.0% | Hupa | 1
0.0% | Iban | 1
0.0% | Inupiaq | 1
0.0% | Irish, Middle (ca. 1100-1550) | 1
0.0% | Kabardian | 1
0.0% | Luba-Lulua | 1
0.0% | Maasai | 1
0.0% | Madurese | 1
0.0% | Mende | 1
0.0% | Minangkabau | 1
0.0% | Nauru | 1
0.0% | Norwegian (Nynorsk) | 1
0.0% | Nzima | 1
0.0% | Ossetic | 1
0.0% | Pampanga | 1
0.0% | Pangasinan | 1
0.0% | Selkup | 1
0.0% | Sign languages | 1
0.0% | Soninke | 1
0.0% | Sukuma | 1
0.0% | Syriac | 1
0.0% | Tai (Other) | 1
0.0% | Terena | 1
0.0% | Tok Pisin | 1
0.0% | Tsimshian | 1
0.0% | Tumbuka | 1
0.0% | Udmurt | 1
0.0% | Wakashan languages | 1
0.0% | Walloon | 1
0.0% | Washoe | 1
97.5% | Total: Single-language titles | 4,630,507
0.1% | Multiple languages | 3,946
2.4% | Undetermined language, No linguistic content, or Code missing | 114,594
100.0% | Grand Total | 4,749,047

"Is this everything?"
No. This table reports the languages of books, journals, videos, sound recordings, and electronic resources in Franklin, but you should treat the "Pct of Total" column, rather than the "Titles" column, as your guide to the Penn Library collections. Active or unsuppressed bibliographic records in Franklin may have been missed if they lacked a language code. Items using two or more languages may have been coded for the prominent language or relegated to "Multiple languages", depending upon cataloging practice. This explains why our Klingon translation of Hamlet continues to be cataloged as "English". And, of course, a "single-language" journal may have an article or two in another language!
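
For readers who want to check a percentage themselves, here is a minimal Python sketch. The three figures are taken from the table above; the function name is our own. Dividing by the single-language total appears to reproduce the 68.6% shown for English, but that is an inference from the arithmetic, not a documented description of how the table was built.

```python
# Figures copied from the table; the helper name is hypothetical.
SINGLE_LANGUAGE_TOTAL = 4_630_507
GRAND_TOTAL = 4_749_047
ENGLISH_TITLES = 3_178_033


def percent_share(titles: int, total: int) -> float:
    """Return titles as a percentage of total, rounded to one decimal place."""
    return round(100.0 * titles / total, 1)


# Share of English among single-language titles, then among all counted titles.
english_of_single = percent_share(ENGLISH_TITLES, SINGLE_LANGUAGE_TOTAL)
english_of_grand = percent_share(ENGLISH_TITLES, GRAND_TOTAL)
print(english_of_single, english_of_grand)
```

The two results differ because the grand total also includes "Multiple languages" and undetermined or uncoded records.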

"It's been nine years. My, how you've grown!"
We produced the first Languages in the Penn Library Collections count on 19 February 2004, and we have now updated it three times on a three-year cycle: on 20 February 2007, on 5 January 2010, and with this 2013 table. The counted collection has grown in nine years, from 3,010,421 titles (2004) to 3,274,516 titles (2007), to 3,804,397 titles (2010), and now to 4,749,047 titles (2013). The number of single-language titles has grown, too: from 2,931,066 titles (2004) to 4,630,507 titles (2013). We've added languages as well: we now have 374 languages or language groups represented in Franklin, up from only 337 in 2004. While the number of titles and the relative proportions of language representation have changed between 2010 and 2013, most of the dramatic growth in English-language titles reflects the hard work of our catalogers on our existing (mostly digital) collections rather than massive acquisitions projects.
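
The growth between counts can be worked out with a few lines of Python. This is only an illustrative sketch: the title counts come from the paragraph above, while the dictionary and function names are our own.

```python
# Total counted titles per count year, as reported in the text above.
title_counts = {2004: 3_010_421, 2007: 3_274_516, 2010: 3_804_397, 2013: 4_749_047}


def percent_growth(old: int, new: int) -> float:
    """Percentage increase from old to new."""
    return 100.0 * (new - old) / old


years = sorted(title_counts)

# Growth over each three-year cycle.
for earlier, later in zip(years, years[1:]):
    change = percent_growth(title_counts[earlier], title_counts[later])
    print(f"{earlier} to {later}: {change:.1f}% more titles")

# Growth over the full nine years.
overall_growth = percent_growth(title_counts[2004], title_counts[2013])
print(f"2004 to 2013: {overall_growth:.1f}% more titles")
```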

"All my Burushaski books! Did you throw them out?"
No, they're still in the catalog and they're still in our stacks. This table uses MARC 21 language codes, as maintained by the Library of Congress for the bibliographic description of information resources. Sparsely published languages may be grouped into generic categories, such as "Bantu (Other)" or "Yupik languages". Burushaski is grouped under the very large "Miscellaneous languages" code. A subsequent revision of this table will include Burushaski books. The Library of Congress revises its language codes from time to time. We try to keep up with the revisions, and this table is part of that effort. An interesting discussion of language coding is provided at Ethnologue's web page, "Three-letter Codes for Identifying Languages". For more information on MARC 21 language codes, see "MARC Code List for Languages" (Library of Congress website).

Mistakes have been made
This table uses MARC 21 language codes appearing in the MARC bibliographic record format's field 008/35-37 "Fixed-Length Data Elements / Language". However, Franklin has been built through decades of cataloging practices, and so the catalog still uses many obsolete codes. When we started counting our languages, "Eskimo Languages" was the preferred term, where now we identify works in Inuktitut, Kalâtdlisut, or Yupik languages. Fossil codes and errors have been re-attributed through examination of individual Franklin records, and this examination is also used to correct and update Franklin records.
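
As a rough illustration of where the code lives, the Python sketch below reads character positions 35-37 from an 008 field held as a plain 40-character string. The sample field, the function names, and the small lookup table are made up for this example, and this is not the procedure used to build the table.

```python
from typing import Optional

# Hypothetical lookup: a handful of MARC language codes, for illustration only.
SAMPLE_CODES = {
    "eng": "English",
    "ger": "German",
    "fre": "French",
    "mul": "Multiple languages",
    "und": "Undetermined",
    "zxx": "No linguistic content",
    "mis": "Miscellaneous languages",
}


def language_code_from_008(field_008: str) -> Optional[str]:
    """Return the three-character code at 008/35-37, or None if it is missing."""
    if len(field_008) < 38:
        return None  # field too short to contain positions 35-37
    code = field_008[35:38]
    if not code.strip() or code == "|||":
        return None  # blanks or fill characters: no language coded
    return code.lower()


def language_name(field_008: str) -> str:
    """Map an 008 field to a language name using the sample lookup above."""
    code = language_code_from_008(field_008)
    if code is None:
        return "Code missing"
    return SAMPLE_CODES.get(code, "Unrecognized code: " + code)


# A made-up 008 field, assembled piece by piece so each position is explicit:
# date entered, date type, date 1, date 2, place, 17 other positions,
# language, modified-record flag, cataloging source.
sample_008 = "130303" + "s" + "2013" + "    " + "pau" + " " * 17 + "eng" + " " + "d"
print(language_name(sample_008))
```

A code that is absent from the lookup is reported as unrecognized rather than guessed at, which is roughly the situation with the obsolete codes described above: each one needs a human look at the record.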
