Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deletion of a Part breaks reading other Parts through "using" method #1729

Open
Asbjoedt opened this issue May 23, 2024 · 5 comments
Open

Deletion of a Part breaks reading other Parts through "using" method #1729

Asbjoedt opened this issue May 23, 2024 · 5 comments
Assignees

Comments

@Asbjoedt
Copy link
Contributor

Asbjoedt commented May 23, 2024

*Describe the bug
Hello
I am upgrading from v2.20 to v.3.0.2.

The new SDK update breaks code which involves deleting/removing some Open XML Parts in my spreadsheet document.

I am applying "using" to read and write the spreadsheet document. I receive error when trying to read any Part after deletion of some other Parts, when I try to read the Part in a new "using" scope. I receive error "Specified part does not exist in the package".

Opening the spreadsheet document wit Excel or LibreOffice works perfectly fine.

Observed behavior
Open XML SDK perceives the spreadsheet document to be broken with error "Specified part does not exist in the package" whenever I try to read any Part, if some Parts have previously been deleted/removed with the "using" method. Excel renders the spreadsheet document without errors.

Expected behavior

Desktop (please complete the following information):

  • OS: Windows 11
  • Office version: Office Professional 2019
  • .NET Target:
  • DocumentFormat.OpenXml Version: 3.0.2
@mikeebowen
Copy link
Collaborator

Hi @Asbjoedt, could you be more specific about which parts you mean:

I am applying "using" to read and write the spreadsheet document. I receive error when trying to read any Part after deletion of some other Parts, when I try to read the Part in a new "using" scope. I receive error "Specified part does not exist in the package".

When you day "read any Part after deletion of some other Parts", do you mean when you read any Part at all even the WorkbookPart? And which parts do you mean by "deletion of some other Parts"?

@Asbjoedt
Copy link
Contributor Author

When I delete these parts, I get "Specified part does not exist in the package", whenever I try to read any Part again:

  • ImagePart
  • ExternalWorkbookPart
  • CalculationChainPart
  • ConnectionsPart
  • TableDefinitionPart
  • SpreadsheetPrinterSettingsPart
  • VolatileDependenciesPart

@twsouthwick
Copy link
Member

please supply a repro so that we can try it out

@Asbjoedt
Copy link
Contributor Author

Asbjoedt commented Jun 14, 2024

Sample spreadsheet with a data connection
With data connection.xlsx

Use this code to remove a data connection

// Remove data connections
public int Remove_DataConnections(string filepath)
{
    int success = 0;

    using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(filepath, true))
    {
        ConnectionsPart conn = spreadsheet.WorkbookPart.ConnectionsPart;

        // Count connections
        success = conn.Connections.Count();

        // Delete all connections
        spreadsheet.WorkbookPart.DeletePart(conn);

        // Delete all QueryTableParts
        IEnumerable<WorksheetPart> worksheetParts = spreadsheet.WorkbookPart.WorksheetParts;
        foreach (WorksheetPart worksheetPart in worksheetParts)
        {
            // Delete all QueryTableParts in WorksheetParts
            List<QueryTablePart> queryTables = worksheetPart.QueryTableParts.ToList(); // Must be a list
            foreach (QueryTablePart queryTablePart in queryTables)
            {
                worksheetPart.DeletePart(queryTablePart);
            }

            // Delete all QueryTableParts, if they are not registered in a WorksheetPart
            List<TableDefinitionPart> tableDefinitionParts = worksheetPart.TableDefinitionParts.ToList();
            foreach (TableDefinitionPart tableDefinitionPart in tableDefinitionParts)
            {
                List<IdPartPair> idPartPairs = tableDefinitionPart.Parts.ToList();
                foreach (IdPartPair idPartPair in idPartPairs)
                {
                    if (idPartPair.OpenXmlPart.ToString() == "DocumentFormat.OpenXml.Packaging.QueryTablePart")
                    {
                        // Delete QueryTablePart
                        tableDefinitionPart.DeletePart(idPartPair.OpenXmlPart);
                        // The TableDefinitionPart must also be deleted
                        worksheetPart.DeletePart(tableDefinitionPart);
                        // And the reference to the TableDefinitionPart in the WorksheetPart must be deleted
                        List<TablePart> tableParts = worksheetPart.Worksheet.Descendants<TablePart>().ToList();
                        foreach (TablePart tablePart in tableParts)
                        {
                            if (idPartPair.RelationshipId == tablePart.Id)
                                tablePart.Remove();
                        }
                    }
                }
            }
        }

        // If spreadsheet contains a CustomXmlMappingsPart, delete databinding
        if (spreadsheet.WorkbookPart.CustomXmlMappingsPart != null)
        {
            CustomXmlMappingsPart xmlMap = spreadsheet.WorkbookPart.CustomXmlMappingsPart;
            List<Map> maps = xmlMap.MapInfo.Elements<Map>().ToList(); // Must be a list
            foreach (Map map in maps)
            {
                if (map.DataBinding != null)
                    map.DataBinding.Remove();
            }
        }
    }
    return success;
}

Then immediately use this code to read all hyperlink relationships of the same spreadsheet. It should fail with error System.InvalidOperationException: 'Specified part does not exist in the package.'

// Extract all cell hyperlinks to an external file
public int Extract_Hyperlinks(string filepath)
{
    int hyperlinks_count = 0;

    // Read spreadsheet
    using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(filepath, false))
    {
        // Find all hyperlinks
        List<HyperlinkRelationship> hyperlinks = spreadsheet.GetAllParts().SelectMany(p => p.HyperlinkRelationships).ToList();

        // Create metadata file
        string folder = System.IO.Path.GetDirectoryName(filepath);
        using (StreamWriter w = File.AppendText($"{folder}\\orgFile_Metadata.txt"))
        {
            w.WriteLine("---");
            w.WriteLine("EXTRACTED HYPERLINKS");
            w.WriteLine("---");

            foreach (HyperlinkRelationship hyperlink in hyperlinks)
            {
                // Write information to metadata file
                w.WriteLine(hyperlink.Uri);
                // Add to count
                hyperlinks_count++;
            }
        }
    }
    return hyperlinks_count;
}

The exception will be thrown at:

// Find all hyperlinks
List<HyperlinkRelationship> hyperlinks = spreadsheet.GetAllParts().SelectMany(p => p.HyperlinkRelationships).ToList();

@Asbjoedt
Copy link
Contributor Author

Asbjoedt commented Jul 4, 2024

Hi @mikeebowen, @twsouthwick
Have you been able to look any further at the issue?
Can I help to progress issue resolution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants