Karsten Schmidt
@toxi@mastodon.thi.ng  ·  2 months ago

Been updating my personal Mastodon tooling to download and convert my bookmarked toots. Here's how little code is needed to download a single message and convert its HTML content into Markdown, all using these #ThingUmbrella packages:

- https://thi.ng/hiccup: Interop data format (i.e. just nested JS arrays) to encode hierarchical documents
- https://thi.ng/hiccup-html-parse: Parses HTML into hiccup format
- https://thi.ng/hiccup-markdown: Serialize hiccup to Markdown (also includes a Markdown parser to hiccup, but not used here)
- https://thi.ng/zipper: Functional tree editing, manipulation & navigation (here to clean/transform the parsed HTML document)
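For readers new to hiccup, here is a minimal, dependency-free sketch of what "just nested JS arrays" means. The `HiccupNode` type and the `textOf` helper are hypothetical names made up for this illustration; they are not part of the thi.ng packages, which define their own types.

```typescript
// Hypothetical minimal type for illustration only (not the thi.ng definition):
// a node is either plain text or [tag, attribs, ...children]
type HiccupNode = string | [string, Record<string, unknown>, ...HiccupNode[]];

// Roughly how <p>Hello <a href="https://thi.ng">thi.ng</a></p>
// could be written as nested arrays
const doc: HiccupNode = [
	"p",
	{},
	"Hello ",
	["a", { href: "https://thi.ng" }, "thi.ng"],
];

// Collect all text content by walking the nested arrays recursively
function textOf(node: HiccupNode): string {
	if (typeof node === "string") return node;
	const [, , ...children] = node;
	return children.map(textOf).join("");
}

console.log(textOf(doc)); // "Hello thi.ng"
```

Because the document is plain data, any tool that can walk arrays can inspect or transform it, which is what makes the parse, edit and serialize steps composable.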

Edit: Gist version of this example code:
https://gist.github.com/postspectacular/1d7ebdc5a81894c16ab744cb8d25c320

#Mastodon #Markdown #TypeScript #JavaScript #OpenSource

Syntax colored TypeScript source code:

import { parseHtml } from "@thi.ng/hiccup-html-parse";
import { serialize } from "@thi.ng/hiccup-markdown";
import { arrayZipper, type Location } from "@thi.ng/zipper";

// load a Mastodon status via API
const res = await (
	await fetch("https://mastodon.thi.ng/api/v1/statuses/115464108396925195")
).json();

// parse HTML content into thing/hiccup format (nested JS arrays)
const parsed = parseHtml(res.content, {
	whitespace: true,
	ignoreAttribs: ["class"],
}).result!;

// structure of parsed example:
// [["p", {}, "text"], ["p", {}, ...], ...]

// recursively traverse result document/array using thi.ng/zipper
// and replace all <span> elements with their raw text body
let loc: Location<any> | undefined = arrayZipper(parsed);
while (loc) {
	loc = loc.next;
	if (Array.isArray(loc?.node) && loc?.node[0] == "span")
		loc = loc.replace(loc.node[2]);
	if (loc?.next == null) break;
}

// serialize hiccup to markdown
console.log(serialize(loc?.root, null));

/*
Result (in markdown format), omitted here due to alt text limits
*/
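To show what the span replacement step does without installing anything, here is a rough sketch using plain recursion instead of thi.ng/zipper. The `Hiccup` type, the `unwrapSpans` function and the sample input are assumptions made up for this illustration. It also differs slightly from the zipper loop above: it splices all children of a `span` into the parent, whereas the loop replaces each `span` with its first child only (`node[2]`).

```typescript
// Hypothetical minimal type for illustration only (not the thi.ng definition)
type Hiccup = string | [string, Record<string, unknown>, ...Hiccup[]];

// Returns the replacement nodes for `node`: a span is replaced by its
// (already processed) children, any other element is rebuilt around them
function unwrapSpans(node: Hiccup): Hiccup[] {
	if (typeof node === "string") return [node];
	const [tag, attribs, ...children] = node;
	const kids = children.flatMap(unwrapSpans);
	return tag === "span" ? kids : [[tag, attribs, ...kids]];
}

// Made-up sample resembling a link whose text is split into spans
const input: Hiccup = [
	"p",
	{},
	"See ",
	[
		"a",
		{ href: "https://thi.ng/hiccup" },
		["span", {}, "https://"],
		["span", {}, "thi.ng/hiccup"],
	],
];

console.log(JSON.stringify(unwrapSpans(input)));
// [["p",{},"See ",["a",{"href":"https://thi.ng/hiccup"},"https://","thi.ng/hiccup"]]]
```

The zipper version has the advantage of giving you a cursor with `next`, `replace` and `root`, so edits can be made in place during a single traversal without writing the recursion by hand.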