XLSX File Format
Microsoft Excel Open XML Spreadsheet
Last updated: February 2026
Overview & History
The XLSX format was introduced alongside DOCX in Microsoft Office 2007 as the XML-based successor to the legacy XLS binary spreadsheet format. Like DOCX, XLSX is part of the Office Open XML (OOXML) specification and was developed to provide a more transparent, interoperable format for spreadsheet data. The final version of the XLS format dated from Excel 97 and carried significant technical debt, including a 65,536-row limit that was increasingly inadequate for modern data analysis.
The move to XLSX brought transformative improvements. The row limit jumped to over one million rows (1,048,576), columns expanded from 256 to 16,384, and the XML-based structure made files significantly smaller through ZIP compression. These changes reflected the growing importance of spreadsheets in data-intensive workflows. XLSX was standardized as part of ISO/IEC 29500, giving organizations confidence in the format's longevity and interoperability.
Today, XLSX is the most widely used spreadsheet format in business and education. It is supported by Microsoft Excel, Google Sheets, LibreOffice Calc, Apple Numbers, and numerous data analysis tools. For sharing spreadsheet data in a fixed layout, users frequently convert Excel to PDF to ensure recipients see the intended formatting regardless of their spreadsheet software. The format's structured XML internals have also made it popular for data exchange in enterprise systems, where automated tools generate and consume XLSX files at scale.
Technical Overview
Like DOCX, an XLSX file is a ZIP archive containing XML files organized in a specific directory structure. The core content resides in xl/worksheets/, where each sheet is stored as a separate XML file (sheet1.xml, sheet2.xml, etc.). Each worksheet XML contains a sheetData element whose row elements hold cell (c) elements. Each cell carries its address in an r attribute and stores a value, a formula, or a reference to shared resources.
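The archive layout described above can be inspected with nothing more than Python's standard library. The following is a minimal sketch, not a full reader: the function names are illustrative, the namespace is the one used by the common transitional flavor of the format, and the default part name xl/worksheets/sheet1.xml is an assumption (a robust reader would resolve sheet parts through xl/workbook.xml and its relationships). The demo archive is a stripped-down stand-in built in memory, not a complete workbook.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the common ("transitional") flavor of SpreadsheetML (assumed here).
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def list_parts(xlsx):
    """Return the names of all parts (files) inside the XLSX ZIP archive."""
    with zipfile.ZipFile(xlsx) as archive:
        return archive.namelist()


def read_raw_cells(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Map cell references (e.g. "A1") to the raw text of their <v> element.

    Values are returned exactly as stored: numbers as strings, and
    shared-string cells as an index into xl/sharedStrings.xml.
    """
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read(sheet_part))
    cells = {}
    for row in root.findall("m:sheetData/m:row", NS):
        for cell in row.findall("m:c", NS):
            value = cell.find("m:v", NS)
            cells[cell.get("r")] = value.text if value is not None else None
    return cells


def build_demo_archive():
    """Build a tiny stand-in archive in memory (hypothetical, not a valid workbook)."""
    sheet_xml = (
        '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
        "<sheetData>"
        '<row r="1"><c r="A1"><v>42</v></c><c r="B1" t="s"><v>0</v></c></row>'
        "</sheetData></worksheet>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("xl/worksheets/sheet1.xml", sheet_xml)
    buffer.seek(0)
    return buffer


demo = build_demo_archive()
print(list_parts(demo))      # ['xl/worksheets/sheet1.xml']
print(read_raw_cells(demo))  # {'A1': '42', 'B1': '0'}
```

Both functions also accept a file path, so pointing them at a real workbook shows the other parts of the archive, such as xl/sharedStrings.xml and xl/styles.xml.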
The xl/sharedStrings.xml file implements string deduplication: rather than repeating identical text in every cell that uses it, each distinct string is stored once in a shared table and cells reference it by zero-based index. This can substantially reduce file size for spreadsheets with many repeated text values. Similarly, xl/styles.xml contains all cell formatting definitions (number formats, fonts, fills, borders, alignment), which cells reference by style index.
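The shared-string lookup can be sketched in a few lines of Python. This example assumes the transitional SpreadsheetML namespace and works on small hand-written XML fragments rather than a real workbook. The helper names load_shared_strings and resolve_cell are illustrative, and phonetic runs and other less common string features are ignored.

```python
import xml.etree.ElementTree as ET

# Namespace of the common ("transitional") flavor of SpreadsheetML (assumed here).
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def load_shared_strings(shared_strings_xml):
    """Return the shared string table as a list, in the order cells index it."""
    root = ET.fromstring(shared_strings_xml)
    table = []
    for item in root.findall("m:si", NS):
        # Plain strings use a single <t>; rich text splits it across <r><t> runs.
        pieces = item.findall("m:t", NS) + item.findall("m:r/m:t", NS)
        table.append("".join(piece.text or "" for piece in pieces))
    return table


def resolve_cell(cell, shared_strings):
    """Return the text of a <c> element, following the t="s" indirection."""
    value = cell.find("m:v", NS)
    if value is None:
        return None
    if cell.get("t") == "s":
        # For shared-string cells, <v> holds an index into the table.
        return shared_strings[int(value.text)]
    return value.text


# Hand-written fragments standing in for xl/sharedStrings.xml and a worksheet.
SHARED = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>North</t></si><si><t>South</t></si></sst>"
)
SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1"><v>12.5</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>0</v></c><c r="B2" t="s"><v>1</v></c></row>'
    "</sheetData></worksheet>"
)

strings = load_shared_strings(SHARED)
sheet = ET.fromstring(SHEET)
resolved = {
    cell.get("r"): resolve_cell(cell, strings)
    for cell in sheet.iterfind("m:sheetData/m:row/m:c", NS)
}
print(resolved)  # {'A1': 'North', 'B1': '12.5', 'A2': 'North', 'B2': 'South'}
```

Cells A1 and A2 both point at index 0, so the text "North" is stored only once however many cells use it.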
Formulas are stored as text in cell elements, usually alongside the last calculated value, and are re-evaluated by the spreadsheet application. Cell references in the file use A1 notation (column letters followed by a row number); R1C1 notation is a display option in applications such as Excel rather than a storage format. The format supports named ranges, data validation rules, conditional formatting, pivot tables, and charts, each with dedicated XML representations. Charts are stored in xl/charts/ as separate XML files using DrawingML markup. For presentation purposes, converting a spreadsheet to PDF renders the calculated cell values and chart graphics into a fixed layout. The format also supports external data connections, while macros and embedded VBA projects require the macro-enabled .xlsm variant; all of these features need careful handling during format conversion.
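As a rough illustration of A1 addressing and formula storage, the sketch below converts column letters to numeric indexes and lists the formula cells in a worksheet fragment. It assumes the transitional SpreadsheetML namespace, handles only simple relative references (no $ markers, ranges, or sheet names), and uses illustrative helper names. The cached value is read from the v element that typically accompanies a formula.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the common ("transitional") flavor of SpreadsheetML (assumed here).
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}
A1_PATTERN = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def column_to_index(letters):
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    index = 0
    for letter in letters:
        index = index * 26 + (ord(letter) - ord("A") + 1)
    return index


def split_a1(reference):
    """Split a simple A1 reference such as "C7" into (column_index, row_number)."""
    match = A1_PATTERN.match(reference)
    if match is None:
        raise ValueError(f"not a simple A1 reference: {reference!r}")
    letters, row = match.groups()
    return column_to_index(letters), int(row)


def list_formulas(sheet_xml):
    """Return {cell reference: (formula text, cached value)} for formula cells."""
    root = ET.fromstring(sheet_xml)
    formulas = {}
    for cell in root.iterfind("m:sheetData/m:row/m:c", NS):
        formula = cell.find("m:f", NS)
        if formula is None:
            continue
        cached = cell.find("m:v", NS)
        cached_text = cached.text if cached is not None else None
        formulas[cell.get("r")] = (formula.text, cached_text)
    return formulas


# Hand-written worksheet fragment; the formula text has no leading "=".
SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1"><v>10</v></c></row>'
    '<row r="2"><c r="A2"><v>20</v></c></row>'
    '<row r="3"><c r="A3"><f>SUM(A1:A2)</f><v>30</v></c></row>'
    "</sheetData></worksheet>"
)

print(split_a1("XFD1048576"))  # (16384, 1048576), the last cell of a worksheet
print(list_formulas(SHEET))    # {'A3': ('SUM(A1:A2)', '30')}
```

The last column, XFD, maps to 16,384, which matches the column limit of the format.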
Pros & Cons
Advantages
- ✓Supports 1,048,576 rows and 16,384 columns per worksheet
- ✓Stores formulas natively, with 400+ built-in functions available in Excel
- ✓Efficient ZIP compression for smaller file sizes
- ✓Wide compatibility across spreadsheet applications
- ✓Supports charts, pivot tables, and data visualization
- ✓Structured XML enables programmatic data extraction
Limitations
- ✕Complex spreadsheets may render differently across applications
- ✕Macro-enabled files (.xlsm) pose security risks
- ✕Large files with many formulas can be slow to calculate
- ✕Not all features transfer between different spreadsheet programs
- ✕XML markup and formatting add overhead for simple data storage
Common Uses
- •Financial modeling and budgeting
- •Data analysis and reporting
- •Inventory tracking and management
- •Project planning and scheduling
- •Scientific data collection and analysis
- •Business intelligence dashboards
- •Accounting and bookkeeping
- •Survey data compilation
Related Guides
The Complete Guide to File Formats and Conversion
A comprehensive guide to understanding file formats and converting between them. Covers documents, images, audio, and more.
18 min read
Preserving Excel Formatting When Converting to and from PDF
Troubleshoot Excel formatting issues during PDF conversion. Learn how to maintain tables, formulas, and layouts across formats.
12 min read
How OCR Works: Extracting Text from Images and PDFs
Learn how Optical Character Recognition (OCR) technology works and how it enables text extraction from scanned documents and images.
9 min read
Technical Details
- Full Name: Microsoft Excel Open XML Spreadsheet
- MIME Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- Type: Document
- Compression: Lossless
- Max File Size: Unlimited (practical ~1GB)
- Transparency: No
- Editable: Yes
- Layers: No
Best For
- ✓Spreadsheets with formulas
- ✓Data analysis
- ✓Charts and pivot tables
- ✓Financial modeling
