# Book Annotation Parser

A Python tool that parses book annotation HTML files and prints the annotations in a simple, readable format.

## Features

- Extracts highlighted text, notes, and chapter information from HTML annotation files
- Prints annotations in a clean, markdown-compatible format
- Displays chapter names as section headers
- Handles HTML parsing errors gracefully
- Command-line interface built with argparse

## Requirements

- Python 3.6+
- BeautifulSoup4 (`pip install beautifulsoup4`)

## Installation

1. Clone or download this repository
2. Install the required dependencies:

```bash
pip install beautifulsoup4
```

## Usage

### Command Line

```bash
python parse_annotations.py <html_file> [book_name] [author_name]
```

Examples:

```bash
# Basic usage with just the HTML file
python parse_annotations.py Axiomatic.html

# Specify book name and author
python parse_annotations.py Axiomatic.html "Axiomatic" "Greg Egan"
```

The output will be printed to the console in this format:

```
# Book Title
by Author Name

## Chapter Name
> Quoted text from the book
Note text if available

## Another Chapter
> Another quote from the book
Another note

Total annotations: 53
```
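The optional `[book_name]` and `[author_name]` arguments can be declared with argparse as optional positionals. The following is a minimal sketch, not the actual contents of `parse_annotations.py`: the `build_parser` helper, the help strings, and the `None` defaults are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for: parse_annotations.py <html_file> [book_name] [author_name]."""
    parser = argparse.ArgumentParser(
        prog="parse_annotations.py",
        description="Parse a book annotation HTML file and print it as Markdown.",
    )
    # Required positional argument.
    parser.add_argument("html_file", help="path to the annotation HTML file")
    # nargs="?" makes a positional argument optional; the None defaults are assumptions.
    parser.add_argument("book_name", nargs="?", default=None, help="book title (optional)")
    parser.add_argument("author_name", nargs="?", default=None, help="author name (optional)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["Axiomatic.html", "Axiomatic", "Greg Egan"])
print(args.html_file, args.book_name, args.author_name)
```

Because both optional arguments are positional, an author name can only be supplied together with a book name.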

### In Python Code

```python
from parse_annotations import parse_and_print_annotations

# Parse and print annotations
annotation_count = parse_and_print_annotations(
    html_file="Axiomatic.html",
    book_name="Axiomatic",
    author_name="Greg Egan"
)

print(f"Printed {annotation_count} annotations")
```

## Output Format

Each annotation is printed as a Markdown blockquote, followed by its note if one exists:

```
> quoted text
note (if available)
```

Chapter names are displayed as Markdown level-2 headers `## Chapter Name` when present.
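To make the format concrete, here is a rough sketch of how such output could be assembled. It is not the parser's actual code: the `format_annotations` function and the `chapter`, `quote`, and `note` dictionary keys are hypothetical, and printing a chapter header only when the chapter changes is an assumption.

```python
def format_annotations(annotations, book_name=None, author_name=None):
    """Return annotations as Markdown text (hypothetical helper, for illustration)."""
    lines = []
    if book_name:
        lines.append(f"# {book_name}")
    if author_name:
        lines.append(f"by {author_name}")
    if lines:
        lines.append("")  # blank line after the title block

    current_chapter = None
    for item in annotations:
        chapter = item.get("chapter")
        # Assumption: emit a level-2 header only when the chapter changes.
        if chapter and chapter != current_chapter:
            lines.append(f"## {chapter}")
            current_chapter = chapter
        lines.append(f"> {item['quote']}")
        if item.get("note"):
            lines.append(item["note"])
        lines.append("")  # blank line between annotations

    lines.append(f"Total annotations: {len(annotations)}")
    return "\n".join(lines)


sample = [
    {"chapter": "Chapter Name", "quote": "Quoted text from the book", "note": "Note text"},
    {"chapter": "Another Chapter", "quote": "Another quote from the book", "note": None},
]
print(format_annotations(sample, "Book Title", "Author Name"))
```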

## Redirecting Output

To save the output to a file, redirect standard output:

```bash
python parse_annotations.py Axiomatic.html "Axiomatic" "Greg Egan" > axiomatic_notes.md
```

This creates a Markdown file with all your annotations that you can view in any Markdown editor.
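When calling the parser from Python rather than the shell, the standard library's `contextlib.redirect_stdout` can capture printed output in a similar way. The sketch below assumes the function writes with `print` to `sys.stdout`. `save_printed_output` is a hypothetical helper, and `print_demo_annotations` is a stand-in for `parse_and_print_annotations` so the example runs on its own.

```python
import contextlib
import os
import tempfile


def save_printed_output(func, path, *args, **kwargs):
    """Call func(*args, **kwargs) with stdout redirected to the file at path."""
    with open(path, "w", encoding="utf-8") as out_file:
        with contextlib.redirect_stdout(out_file):
            return func(*args, **kwargs)


def print_demo_annotations():
    # Hypothetical stand-in for parse_and_print_annotations.
    print("## Chapter Name")
    print("> Quoted text from the book")
    return 1


# Write to a temporary directory so the sketch does not touch the working directory.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_notes.md")
count = save_printed_output(print_demo_annotations, demo_path)
print(f"Saved {count} annotation(s) to {demo_path}")
```

With the real function, the call would look like `save_printed_output(parse_and_print_annotations, "axiomatic_notes.md", html_file="Axiomatic.html")`, assuming all of its output goes through `sys.stdout`.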

## Testing

Run the test script to see the parser in action:

```bash
python test_print.py
```

This demonstrates the parser's output format using the sample annotation file.