iOS - Decoding word-encoded Content-Disposition header file name in Objective-C
I am trying to retrieve a file name that can't be represented in ASCII from the Content-Disposition header.
This file name is word-encoded. Below is the encoded file name:
=?utf-8?q?=c3=abst=c3=a9_=c3=a9_=c3=bam_n=c3=b4m=c3=a9?= =?utf-8?q?_a=c3=a7ent=c3=baad=c3=b5.xlsx?=
How do I get the decoded file name (that is, "ësté é úm nômé açentúadõ.xlsx")?
PS: I am looking for an Objective-C implementation.
You probably want to search for a MIME handling framework, but I searched online and came up with nothing, so....
I couldn't find an example online, so I'm just showing the algorithm here. It's not the best example since I'm making a big assumption: that the string is always UTF-8 and Q-encoded.
Q-encoding is like URL-encoding (percent-encoding), which Foundation's NSString already has support for decoding. The only (practical) difference when decoding (there are bigger differences when encoding) is that the % escapes are = escapes instead, and that an underscore stands for a space.
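If you'd rather not go through NSString's percent-escape decoding at all, the same "= instead of %" rule can be applied directly to bytes. This is only a minimal sketch in plain C (which compiles unchanged in an Objective-C file); `q_decode` and `hex_value` are names I made up for illustration, not Foundation APIs, and it assumes the input is just the text between the lead-in and the lead-out.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the numeric value of a hex digit, or -1 if c is not one. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Decodes Q-encoded text `in` into `out` and NUL-terminates it.
 * `out` needs room for strlen(in) + 1 bytes (decoding never grows the text).
 * Returns the number of decoded bytes, not counting the NUL.
 * The result is raw bytes in whatever charset the encoded-word declared.
 */
size_t q_decode(const char *in, char *out) {
    size_t n = 0;
    while (*in != '\0') {
        int hi, lo;
        if (*in == '_') {
            /* Underscore stands for a space. */
            out[n++] = ' ';
            in++;
        } else if (*in == '=' &&
                   (hi = hex_value(in[1])) >= 0 &&
                   (lo = hex_value(in[2])) >= 0) {
            /* "=XX" is one byte written as two hex digits. */
            out[n++] = (char)(hi * 16 + lo);
            in += 3;
        } else {
            /* Everything else, including '%', is copied through as-is. */
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, `q_decode("=c3=a9_ok", out)` leaves the UTF-8 bytes for "é ok" in `out`, which you could then hand to `+[NSString stringWithUTF8String:]`. Note that a literal % needs no special treatment here, since nothing is routed through URL decoding.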
Then there's the lead-in and lead-out stuff. Each encoded block has the format =?charset-name?encoding-type? ... encoded string here ... ?=
. You should read the charset name and use that encoding, and you should read the encoding-type, since it may be "q" or "b" (base64).
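To make the lead-in / lead-out format a bit more concrete, here is a rough sketch of pulling one encoded block apart into its charset name, encoding type and payload. It is plain C (usable from Objective-C); the `encoded_word` struct and `parse_encoded_word` function are hypothetical names of mine, and the 32-byte charset buffer is an arbitrary limit I picked for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* One parsed block of the form =?charset?encoding?text?= */
typedef struct {
    char charset[32];   /* e.g. "utf-8", NUL-terminated */
    char encoding;      /* 'q' or 'b', lowercased */
    const char *text;   /* payload; points into the input, not NUL-terminated */
    size_t text_len;    /* payload length in bytes */
} encoded_word;

/*
 * Parses one encoded block that starts exactly at `s`.
 * On success fills in `w` and returns a pointer just past the closing "?=".
 * Returns NULL if `s` does not start with a well-formed block.
 */
const char *parse_encoded_word(const char *s, encoded_word *w) {
    if (s[0] != '=' || s[1] != '?') return NULL;        /* lead-in "=?" */

    const char *charset = s + 2;
    const char *sep = strchr(charset, '?');             /* end of charset name */
    if (sep == NULL || sep == charset) return NULL;
    size_t charset_len = (size_t)(sep - charset);
    if (charset_len >= sizeof w->charset) return NULL;  /* name too long for the buffer */

    if (sep[1] == '\0' || sep[2] != '?') return NULL;   /* expect "?x?" */
    char enc = (char)tolower((unsigned char)sep[1]);
    if (enc != 'q' && enc != 'b') return NULL;          /* unknown encoding type */

    const char *text = sep + 3;
    const char *end = strstr(text, "?=");               /* lead-out "?=" */
    if (end == NULL) return NULL;

    memcpy(w->charset, charset, charset_len);
    w->charset[charset_len] = '\0';
    w->encoding = enc;
    w->text = text;
    w->text_len = (size_t)(end - text);
    return end + 2;
}
```

The returned pointer lets you keep walking the header value: skip whitespace, try to parse the next block, and dispatch on `w.encoding` and `w.charset` for each one.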
This example only works for Q-encoding (a subset of quoted-printable). You should be able to modify it to handle different charsets and to handle base64 encoding, however.
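For the "b" case, the payload is ordinary base64, so the only change is which decoder you run on the text between the lead-in and lead-out. Below is a small hand-rolled base64 decoder in plain C, just as a sketch; `b64_decode` and `b64_value` are my own illustrative names, and it simply stops at the first padding or invalid character rather than reporting errors.

```c
#include <stddef.h>

/* Maps a base64 alphabet character to its 6-bit value, or -1 if invalid. */
static int b64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/*
 * Decodes base64 text `in` into `out` and NUL-terminates it.
 * `out` needs room for strlen(in) + 1 bytes, which is always enough.
 * Decoding stops at '=' padding or at the first character outside the alphabet.
 * Returns the number of decoded bytes, not counting the NUL.
 */
size_t b64_decode(const char *in, char *out) {
    size_t n = 0;
    unsigned int acc = 0;   /* bit accumulator */
    int bits = 0;           /* number of not-yet-emitted bits in acc */

    for (; *in != '\0'; in++) {
        int v = b64_value(*in);
        if (v < 0) break;
        acc = ((acc << 6) | (unsigned int)v) & 0xFFFFu;
        bits += 6;
        if (bits >= 8) {
            /* A full byte is available: emit the top 8 pending bits. */
            bits -= 8;
            out[n++] = (char)((acc >> bits) & 0xFFu);
        }
    }
    out[n] = '\0';
    return n;
}
```

For instance, `b64_decode("w6k=", out)` produces the two bytes C3 A9, which is "é" in UTF-8. As with the Q case, the bytes still have to be interpreted in the charset named in the lead-in.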
    #import <Foundation/Foundation.h>

    int main(void) {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        NSString *encodedString = @"=?utf-8?q?=c3=abst=c3=a9_=c3=a9_=c3=bam_n=c3=b4m=c3=a9?= =?utf-8?q?_a=c3=a7ent=c3=baad=c3=b5.xlsx?=";
        NSScanner *scanner = [NSScanner scannerWithString:encodedString];
        NSString *buf = nil;
        NSMutableString *decodedString = [[NSMutableString alloc] init];

        // Either we are already at a lead-in, or we copy the plain text
        // in front of the next lead-in and then consume the lead-in.
        while ([scanner scanString:@"=?utf-8?q?" intoString:NULL]
               || ([scanner scanUpToString:@"=?utf-8?q?" intoString:&buf]
                   && [scanner scanString:@"=?utf-8?q?" intoString:NULL])) {
            if (buf != nil) {
                [decodedString appendString:buf];
            }
            buf = nil;

            NSString *encodedRange;
            if (![scanner scanUpToString:@"?=" intoString:&encodedRange]) {
                break; // Invalid encoding
            }
            [scanner scanString:@"?=" intoString:NULL]; // Skip the terminating "?="

            // Decode the encoded portion (naively using UTF-8 and assuming it is Q encoded)
            // I'm doing this naively, but it should work

            // Firstly I'm encoding % signs so I can cheat and turn this into a URL-encoded string, which NSString can decode
            encodedRange = [encodedRange stringByReplacingOccurrencesOfString:@"%" withString:@"=25"];
            // Turn this into a URL-encoded string
            encodedRange = [encodedRange stringByReplacingOccurrencesOfString:@"=" withString:@"%"];
            // Remove the underscores
            encodedRange = [encodedRange stringByReplacingOccurrencesOfString:@"_" withString:@" "];

            [decodedString appendString:[encodedRange stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]];
        }

        NSLog(@"decoded string = %@", decodedString);

        [decodedString release];
        [pool drain];
        return 0;
    }
This outputs:
    chrisbook-pro:~ chris$ ./qp-decode
    2010-12-01 18:54:42.903 qp-decode[9643:903] decoded string = ësté é úm nômé açentúadõ.xlsx