recursion - Functionally split a string by whitespace, group by quotes!
Writing idiomatic functional code in Clojure[1], how would one write a function that splits a string by whitespace but keeps quoted phrases intact? A quick solution is of course to use regular expressions, but this should be possible without them. At a quick glance it seems pretty hard! I've written similar code in imperative languages, but I'd like to see how a functional, recursive approach works.
A quick check of what our function should do:

    "hello there!" -> ["hello", "there!"]
    "'a quoted phrase'" -> ["a quoted phrase"]
    "'a' 'b' c d" -> ["a", "b", "c", "d"]
    "'a b' 'c d'" -> ["a b", "c d"]
    "mid'dle 'quotes not concern me'" -> ["mid'dle", "quotes not concern me"]
I don't mind if the spacing changes between the quotes (so one can use simple splitting on whitespace first).

    "'lots of spacing' there" -> ["lots of spacing", "there"] ; is OK by me
[1] This question could be answered at a general level, but I guess a functional approach in Clojure can be translated to Haskell, ML, etc. with ease.
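For comparison, the regular-expression route mentioned in the question might look like the sketch below. The name `regex-splitter` and the pattern are illustrative assumptions, not something from the original post: the pattern tries a single-quoted phrase first and otherwise falls back to a run of non-whitespace characters.

```clojure
;; Hypothetical regex baseline: a quoted phrase, or else a bare word.
;; re-seq returns [whole-match quoted bare] for each match; the group
;; that did not take part in the match is nil.
(defn regex-splitter [s]
  (map (fn [[_ quoted bare]] (or quoted bare))
       (re-seq #"'([^']*)'|(\S+)" s)))

(regex-splitter "'a' 'b' c d")
;; => ("a" "b" "c" "d")
(regex-splitter "mid'dle 'quotes not concern me'")
;; => ("mid'dle" "quotes not concern me")
```

Unlike the whitespace-first idea, this keeps the spacing inside a quoted phrase untouched.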
Here's a version returning a lazy seq of words / quoted strings:
    (defn splitter [s]
      (lazy-seq
       (when-let [c (first s)]
         (cond
           ;; skip whitespace between words
           (Character/isSpace c)
           (splitter (rest s))

           ;; opening quote: take everything up to the next quote
           (= \' c)
           (let [[w* r*] (split-with #(not= \' %) (rest s))]
             (if (= \' (first r*))
               (cons (apply str w*) (splitter (rest r*)))
               (cons (apply str w*) nil)))

           ;; bare word: take everything up to the next whitespace
           :else
           (let [[w r] (split-with #(not (Character/isSpace %)) s)]
             (cons (apply str w) (splitter r)))))))
A test run:

    user> (doseq [x ["hello there!"
                     "'a quoted phrase'"
                     "'a' 'b' c d"
                     "'a b' 'c d'"
                     "mid'dle 'quotes not concern me'"
                     "'lots of spacing' there"]]
            (prn (splitter x)))
    ("hello" "there!")
    ("a quoted phrase")
    ("a" "b" "c" "d")
    ("a b" "c d")
    ("mid'dle" "quotes not concern me")
    ("lots of spacing" "there")
    nil
If the single quotes in the input don't match up properly, everything from the final opening single quote is taken to constitute one "word":

    user> (splitter "'asdf")
    ("asdf")
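The reason is visible in how `split-with` behaves when no closing quote is found. The sketch below isolates that step; `quoted-prefix` is a hypothetical helper written only for illustration. It returns the collected word together with the first character of the remainder, which is `nil` when the input ran out before a closing quote appeared.

```clojure
;; Hypothetical helper mirroring the quoted branch of splitter:
;; drop the opening quote, then split the rest at the next quote.
(defn quoted-prefix [s]
  (let [[w r] (split-with #(not= \' %) (rest s))]
    [(apply str w) (first r)]))

(quoted-prefix "'a b' c") ;; => ["a b" \']
(quoted-prefix "'asdf")   ;; => ["asdf" nil]
```

In the second call the remainder is empty, so the whole tail becomes the final word.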
Update: another version, in answer to edbond's comment, with better handling of quote characters inside words:
    (defn splitter [s]
      ((fn step [xys]
         (lazy-seq
          (when-let [c (ffirst xys)]
            (cond
              ;; skip whitespace between words
              (Character/isSpace c)
              (step (rest xys))

              ;; opening quote: read until a quote that is followed
              ;; by whitespace or by the end of the input
              (= \' c)
              (let [[w* r*]
                    (split-with (fn [[x y]]
                                  (or (not= \' x)
                                      (not (or (nil? y)
                                               (Character/isSpace y)))))
                                (rest xys))]
                (if (= \' (ffirst r*))
                  (cons (apply str (map first w*)) (step (rest r*)))
                  (cons (apply str (map first w*)) nil)))

              ;; bare word: read until the next whitespace
              :else
              (let [[w r] (split-with (fn [[x y]] (not (Character/isSpace x))) xys)]
                (cons (apply str (map first w)) (step r)))))))
       ;; pair each character with its successor (nil after the last one)
       (partition 2 1 (lazy-cat s [nil]))))
A test run:

    user> (doseq [x ["hello there!"
                     "'a quoted phrase'"
                     "'a' 'b' c d"
                     "'a b' 'c d'"
                     "mid'dle 'quotes not concern me'"
                     "'lots of spacing' there"
                     "mid'dle 'quotes no't concern me'"
                     "'asdf"]]
            (prn (splitter x)))
    ("hello" "there!")
    ("a quoted phrase")
    ("a" "b" "c" "d")
    ("a b" "c d")
    ("mid'dle" "quotes not concern me")
    ("lots of spacing" "there")
    ("mid'dle" "quotes no't concern me")
    ("asdf")
    nil
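The key trick in the updated version is pairing each character with the one that follows it, so the quoted branch can look one character ahead. A minimal sketch of that idea in isolation is shown below; `with-lookahead` and `closing-quote?` are hypothetical names introduced here, not functions from the answer.

```clojure
;; Pair every character with its successor; nil marks the end of input.
(defn with-lookahead [s]
  (partition 2 1 (lazy-cat s [nil])))

;; A quote closes a phrase only when whitespace or end of input follows it.
(defn closing-quote? [[x y]]
  (and (= \' x)
       (or (nil? y) (Character/isSpace y))))

(with-lookahead "a'b")
;; => ((\a \') (\' \b) (\b nil))
(map closing-quote? (with-lookahead "no't'"))
;; => (false false false false true)
```

The quote inside `no't` is followed by a letter, so it is treated as part of the word, while the final quote is followed by the end of the input and closes the phrase.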