missing code point #14

Open
whh1009 opened this issue Jul 27, 2020 · 4 comments

whh1009 commented Jul 27, 2020

When I use sfntly to extract a subset of a font, some Unicode code points are found correctly, but others are not. I am a little confused; please help take a look.

```java
public static void main(String[] args) throws Exception {
    String codes = "\\u5e7e\\u8EAB\\ue85d\\ue85e\\u21deb\\u21df8\\u347e\\u347F";
    File srcFontFile = new File("D:\\wanghonghui\\Desktop\\mytest.ttf");
    File disFontFile = new File("D:\\wanghonghui\\Desktop\\test.ttf");
    getSubFont(codes, srcFontFile, disFontFile);
}

public static void getSubFont(String ucodes, File srcFontFile, File disFontFile) throws Exception {
    long start = System.currentTimeMillis();
    Font font = FontUtils.getFonts(new FileInputStream(srcFontFile))[0];
    Set<Integer> glyphs = new LinkedHashSet<Integer>();
    CMapTable cMapTable = font.getTable(Tag.cmap);
    CMap cmap = cMapTable.cmap(Font.PlatformId.Windows.value(), Font.WindowsEncodingId.UnicodeUCS4.value());
    System.err.println(cmap);
    int glyphId = 0;
    for (String ucode : ucodes.split("\\\\u")) {
        if (StringUtils.isEmpty(ucode)) continue;
        glyphId = cmap.glyphId(Integer.parseInt(ucode, 16));
        if (glyphId != 0) {
            glyphs.add(glyphId);
        } else {
            System.err.println("code:" + ucode + ",not found");
        }
    }
    FontFactory fontFactory = FontFactory.getInstance();
    Subsetter subsetter = new RenumberingSubsetter(font, fontFactory);
    List<Integer> glyphList = new ArrayList<Integer>(glyphs);
    subsetter.setGlyphs(glyphList);
    Font newFont = subsetter.subset().build();
    FileOutputStream fos = new FileOutputStream(disFontFile);
    fontFactory.serializeFont(newFont, fos);
    long used = System.currentTimeMillis() - start;
    System.err.println("time: " + used + "ms");
}
```

whh1009 commented Jul 27, 2020

font.zip

This is the font file, zipped.

Are there any special requirements for fonts when using sfntly?

rillig commented Jan 27, 2022

Hello whh,

String codes = "\u5e7e\u8EAB\ue85d\ue85e\u21deb\u21df8\u347e\u347F";

Some of the \u sequences look as if they contain 5 hex digits, for example \\u21df8. Did you really intend to include the code points U+21DF "DOWNWARDS ARROW WITH DOUBLE STROKE" and U+0038 "DIGIT EIGHT"?

whh1009 commented Feb 17, 2022

Thank you very much for your reply. \u21df8 is meant as a single Unicode code point, which actually corresponds to a Chinese character; please see https://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=21df8&useutf8=true. It seems there are problems when using 5 hex digits.

rillig commented Feb 17, 2022

In Java, the character sequence \u21df8 is interpreted as U+21DF followed by U+0038. That's how it is: Java doesn't support \u with more than 4 hexadecimal digits. See JLS 17 sections 3.1 to 3.3.
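
To see this behavior directly, here is a minimal sketch using only the standard library. The class name `UnicodeEscapeDemo` and the helper `describe` are made up for illustration; they are not part of sfntly or of the code in this issue.

```java
public class UnicodeEscapeDemo {

    // Lists the UTF-16 code units of s in "U+XXXX" notation, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The compiler consumes exactly four hex digits after the escape prefix,
        // so the trailing 8 stays an ordinary character.
        String s = "\u21df8";
        System.out.println(s.length());   // 2
        System.out.println(describe(s));  // U+21DF U+0038
    }
}
```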

If you encode your desired code points in UTF-16, this may already solve your problem.
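
As a rough sketch of what that could look like for U+21DF8, the following uses only `java.lang` APIs. The class name `SupplementaryCodePointDemo` and the helper `fromCodePoint` are hypothetical, and whether this resolves the lookup in the font above is not confirmed here.

```java
public class SupplementaryCodePointDemo {

    // Builds a String from a numeric code point; code points above U+FFFF
    // become a surrogate pair of two UTF-16 code units.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // Option 1: start from the numeric code point.
        String fromInt = fromCodePoint(0x21DF8);

        // Option 2: write the surrogate pair as two 4-digit escapes.
        String fromEscapes = "\uD847\uDDF8";

        System.out.println(fromInt.equals(fromEscapes));                   // true
        System.out.println(fromInt.length());                              // 2
        System.out.println(fromInt.codePointCount(0, fromInt.length()));   // 1
        System.out.println(Integer.toHexString(fromInt.codePointAt(0)));   // 21df8
    }
}
```

`String.codePointAt` returns the full code point as an `int`, which is the kind of value that `cmap.glyphId(...)` receives in the code from the original post.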

Contrary to Java, Unicode allows 5 or 6 digits when referring to a code point such as U+21DF8. Keep this difference between Unicode and Java in mind.
