Custom GPTs: Upload Limits and Privacy (Important)
One of the most powerful things about building custom GPTs is that you can upload knowledge for the GPT to reference, such as training manuals, product manuals and specifications, internal policies, frequently asked questions (FAQs), industry research and reports, customer service scripts, company history and culture documents, technical whitepapers, business plans, marketing strategies, financial reports and budgets, and sales scripts and techniques.

While this is incredible, we had questions:

- What kind of files can I upload?
- How many files can I upload?
- What are the limits on these files?
- And the #1 question... Does OpenAI use the files I upload to train their models?

Here are some answers I was able to gather for us:

File Information:

- You can upload text documents, spreadsheets, and presentations (pretty much any common file type).
- Up to 20 files can be uploaded for each custom GPT.
- Documents can be up to 512MB each. For spreadsheets, there's no limit on how much information they contain, but the file size limit still applies.
- Images within documents need to be under 20MB.
- Each end-user is capped at 10GB. An error will be displayed if a user's cap has been hit.
- All text and document files uploaded to a GPT are capped at 2 million tokens per file. This limitation does not apply to spreadsheets.

  What does 2 million tokens mean? On average, one token is roughly equivalent to 4 characters or ¾ of a word in English. This is a very rough approximation, as the exact number will vary based on the language and the specific content of your document. Using the average case, 2 million tokens translates to around 1.5 million words. Again, this is a broad estimate; the actual word count could be higher or lower depending on how tokenization splits your specific text.
- Images in Documents: Right now, you can upload documents with images, but GPTs can't process images within documents (e.g., an image on a slide in a slide deck). OpenAI is working on adding this feature.
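
If you want a quick sanity check before uploading, the limits and the token rule of thumb above can be sketched in a few lines of Python. This is only a rough illustration, not an official tool: the function names, the list of "text" file extensions, and the choice of 1MB = 1024 × 1024 bytes are my own assumptions, and the token count is an estimate from character length rather than real tokenization.

```python
import math
from pathlib import Path

# Limits as quoted in this post; they may change over time.
MAX_FILES_PER_GPT = 20
MAX_FILE_BYTES = 512 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MAX_TOKENS_PER_FILE = 2_000_000

# Rule-of-thumb ratios for English text (very rough averages).
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75

# Assumption: only these plain-text extensions get a token estimate here.
TEXT_SUFFIXES = {".txt", ".md"}


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def tokens_to_words(tokens: int) -> int:
    """Rough word estimate: about 3/4 of a word per token."""
    return round(tokens * WORDS_PER_TOKEN)


def check_upload(paths) -> list[str]:
    """Return a list of problems found; an empty list means no limit was hit."""
    paths = [Path(p) for p in paths]
    problems = []

    if len(paths) > MAX_FILES_PER_GPT:
        problems.append(f"{len(paths)} files exceeds the {MAX_FILES_PER_GPT}-file limit")

    for path in paths:
        if not path.is_file():
            problems.append(f"{path}: file not found")
            continue
        if path.stat().st_size > MAX_FILE_BYTES:
            problems.append(f"{path}: larger than 512MB")
        if path.suffix.lower() in TEXT_SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if estimate_tokens(text) > MAX_TOKENS_PER_FILE:
                problems.append(f"{path}: estimated to exceed 2 million tokens")

    return problems


if __name__ == "__main__":
    print(tokens_to_words(MAX_TOKENS_PER_FILE))  # 1500000
```

Calling `check_upload(["manual.txt", "faq.md"])` (hypothetical file names) returns a list of plain-English warnings. Treat a clean result as a hint rather than a guarantee, since the actual token count depends on how your specific text is tokenized.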