Problem with items with same name

I have a mysql bibliographical table, with author family name, author given name, title etc…
When there are two authors with the same family name (but different given name) they are merged in one author.
The code I use is

    $sql = $pdo->query("
    SELECT a.autore, a.autore_nome, b.autore, b.autore_nome, b.imagelink, b.titolo, b.sigla as 'data SUBSTRING(data,1,4)', b.edizione, b.luogo, b.data, b.trad_titolo, b.trad_edizione, b.trad_luogo, b.trad_data, b.rivista, b.num, b.pagg, b.keywords, b.ambito, b.reperibilita, b.scheda
    FROM bibliografie__autori a
    INNER JOIN bibliografie b
    WHERE a.autore = b.autore AND a.autore_nome = b.autore_nome
    ORDER BY a.autore, b.data
    ");

In Italian, my language, autore is author (I use it as family name), and autore_nome is author given name.

Where the above query is wrong?

I think it is more that the database structure is wrong.
You are duplicating the author names in both tables. The names should be in the bibliografie__autori table only. The bibliografie table should reference the author by its unique ID from the author table.

The query may then look something like:-

    SELECT a.autore, a.autore_nome, b.imagelink, b.titolo, b.sigla as 'data SUBSTRING(data,1,4)', b.edizione, b.luogo, b.data, b.trad_titolo, b.trad_edizione, b.trad_luogo, b.trad_data, b.rivista, b.num, b.pagg, b.keywords, b.ambito, b.reperibilita, b.scheda
    FROM bibliografie__autori a
    INNER JOIN bibliografie b
    WHERE a.autore_id = b.autore_id
    ORDER BY a.autore, b.data
2 Likes

Thank you, but bibliografie has several books for every author, therefore almost every author has several ids, as many as his books, While in bibliografie__autori, a view, not a table, I have only one author with a given name…
The following is the code of the view bibliografie__autori

select `bibliografia`.`bibliografie`.`autore` AS `autore`,
`bibliografia`.`bibliografie`.`autore_nome` AS `autore_nome`,
`bibliografia`.`bibliografie`.`coautore` AS `coautore`,
`bibliografia`.`bibliografie`.`coautore_nome` AS `coautore_nome` 
from `bibliografia`.`bibliografie` 
where `bibliografia`.`bibliografie`.`autore`  not like '%AA.VV.%' 
group by `bibliografia`.`bibliografie`.`autore`,`bibliografia`.`bibliografie`.`autore_nome`

And here lies the problem. An author should have only one ID which is unique to them which is the Primary Key of the author table.
This ID is then referenced in the table of books. Books by the same author will of course reference the same ID. The ID is unique in the author table, but may appear many times in the book table.

The author table may be like:-
autore_id, autore, autore_nome
Plus any other info field you want there.

The book table may be like:-
bibliografie_id, imagelink, titolo, sigla, autore_id
Plus the other fields you have.

2 Likes

Thank you. I will see what I can do. :slightly_smiling_face:

Maybe I was not clear: several ids are not of authors, but of authors’ books (every book, not every author has an unique id).

I tried to add an unique id, but it doesn’t work:

    SELECT a.autore, a.autore_nome, b.autore, b.autore_nome, a.id_autore, b.id_autore, b.imagelink, b.titolo, b.edizione, b.luogo, b.data, b.trad_titolo, b.trad_edizione, b.trad_luogo, b.trad_data, b.rivista, b.num, b.pagg, b.keywords, b.ambito, b.reperibilita, b.scheda
    FROM bibliografie__autori a
    INNER JOIN bibliografie b
    WHERE a.id_autore = b.id_autore
    ORDER BY a.autore, b.data

It doesn’t work:
a) because are displayed only items with (some) id
b) and because that items are always merged under the same family name (even they have different ids)

Not to mention that the duplicate cases are only 3, while add ids to every author would be very long :frowning:

Moreover I noticed that, back to my original code, with the following codes I get the same result

WHERE a.autore = b.autore and WHERE a.autore = b.autore AND a.autore_nome = b.autore_nome

A quite dirty workaround would be add to the few duplicated family names something like ‏ (invisible character).
This works, but is not a good, deep solution, I know…

what does this mean, “merged”?

this suggests an aggregation, like what you would get with GROUP BY

however, you’re using only ORDER BY, and sorting only on the surname

try this –

ORDER BY a.autore, a.autore_nome, b.data
1 Like

I mean that the books of “Vassallo Salvatore” are shown under “Vassallo Nicla”, as they were of “Vassallo Nicla” (same family name), even “Vassallo Salvatore” and “Vassallo Nicla” have two different author ids.

But I think that I should explain better the whole problem. I will do it in a next post.

I guess that I have to explain better my problem.

  1. I have two sources of data (for this query): a table and a view (from that table).
    2a. The table (bibliografie) contains many books, with author, title, publisher, date, etc.
    2b. The view (bibliografie__autori) contains only the authors (and you can see its code at my post #3)

  2. The aim of the query (reported in my first post) is to show all the books of an author “under” that author. And she works, but there is a problem with authors with the same family name, as I said in my first post (#1)

  3. The only workaround is that “dirty” one (adding an invisible character after a famly name). I attach the correct result:

  4. But, this is not a good solution, of course …

  5. The code you suggested, r937 (#8), doesn’t work (=nothing changes)

  6. I guess that an important point would be understand why - as I said in my #6 post, "with the following codes I get the same result:
    WHERE a.autore = b.autore and WHERE a.autore = b.autore AND a.autore_nome = b.autore_nome

  7. Maybe it is a good thing report the whole code, here you are

    $sql = $pdo->query("
    SELECT a.autore, a.autore_nome, b.autore, b.autore_nome, a.id_autore, b.id_autore, b.imagelink, b.titolo, b.sigla as 'data SUBSTRING(data,1,4)', b.edizione, b.luogo, b.data, b.trad_titolo, b.trad_edizione, b.trad_luogo, b.trad_data, b.rivista, b.num, b.pagg, b.keywords, b.ambito, b.reperibilita, b.scheda
    FROM bibliografie__autori a
    INNER JOIN bibliografie b
    WHERE a.autore = b.autore AND a.autore_nome = b.autore_nome
    ORDER BY a.autore, b.data
    ");
    $schedature="biblio/schedature/";
    $autori = $sql->fetchAll(PDO::FETCH_GROUP);
    $posts = array();
 foreach($autori as $autore => $books) {
    echo "
     <div class='left'> 
     <h2>" . $books[0]['autore'] . ", " . $books[0]['autore_nome'] . "  </h2>\n
     ";
     foreach($books as $book) {
         $imageastooltip ="<span class='tooltip'><span class='tooltiptext'><img src='$book[imagelink]' /></span><i>$book[titolo]</i></span>";
        echo "<p>[";
        echo substr("$book[data]",0,4);
        echo "] ";
        if(("$book[imagelink]") != '') {echo "<span class='tooltip min gray'><span class='tooltiptext'><img src=\"$book[imagelink]\"style=\"width: 120px;\" /></span>[cop]</span> ";}
        if(empty("$book[rivista]")) {echo ""."<i>$book[titolo]</i>, ";} else {echo "“$book[titolo]”, in <i>$book[rivista]</i>, ";}
        if(empty("$book[rivista]")) {echo ""."$book[edizione], "."$book[luogo] "."$book[data]";} else {echo " n. $book[num] ("."$book[data]), pp. $book[pagg].";}
        if(("$book[trad_titolo]") != '') {echo ", tr.it. <i>$book[trad_titolo]</i>, $book[trad_edizione], $book[trad_luogo] $book[trad_data]";}
        if(("$book[scheda]") != '') {echo "<details class='shadow'><summary>sintesi</summary>\n<md-block>"; include "$schedature/$book[scheda]"; echo "\n</md-block></details>";}
        echo "<span class='linkable'>";
        $keywords = "$book[keywords]";
                foreach (explode(',', $keywords) as $key) //{
                if($key != '') {echo "<span class='keyword'><a href=\"hashtag-biblio.php?tag=$key\">{$key}</a></span>";}
        $ambito = "$book[ambito]";
                foreach (explode(',', $ambito) as $keya) //{
                if($keya != '') {echo "<span class='ambito'><a href=\"hashtag-biblio.php?tag=$keya\">{$keya}</a></span>";}
        $reperibilita = "$book[reperibilita]";
                foreach (explode(',', $reperibilita) as $keyr) //{
                if($keya != '') {echo "<span class='reperibilita'><a href=\"hashtag-biblio.php?tag=$keyr\">{$keyr}</a></span>";}
        echo "</span>.</p>";
    }
    echo "
    </div>\n";
 }
?>

</div>

without trying to start an argument, i disagree :sunglasses:

it does work, and it does change the actual order of the resulting rows

however, i’m not sure the results will be correct if the join itself is faulty

i wonder if your structure is handling co-authors properly

sorry, cannot help you with the php

1 Like

Thank you. I don’t understand what you mean saying it does work: not on my PC, at least :slightly_smiling_face:.
No help required for php, of course, I add the whole code to see if anyone could see better where is the problem.
Co-authors column could be the cause, do you think? But if you could say some more about it, I would be helped to understand …

Question - are autore the last/family name and autore_nome the first name?

If so, that’s why your data is getting tossed together because you’re sorting by autore and b.data (whatever that is). If you want last name AND first name, you need to ORDER BY a.autore, a.autore_nome, b.data - your where clause does all the last/first matches, but the order by just sorts by last.

What he means (most likely) is the query is doing exactly what you’re asking it to do but the data structure does not look like it would allow you to get the results you’re looking for.

  • Your queries don’t take into account when a person is a co-author at all, which seems disingenuous,
  • The GROUP BY in your view is useless (I’m surprised you’re not getting an error) because you’re missing the co-author information in the group by, so you’re still going to get all the rows regardless.

NOTE: The more data you have under this structure, the more problematic your performance is going to be . A view that just filters out data that then rejoins the table again tells me that a normalization refactoring is needed (though I’m sure @r937 could come up with a perfectly logical scenario where your approach would be appropriate)

@SamA74 was trying to steer you down the path I’d follow. Split off all the data specific to an author into another table, then use a foreign key to join the book and author tables together.

Though if you ever have more than one co-author (example), even this structure breaks down. At which point you’d want a cross-reference table, which I’d do something like this

bibliographie_autore
----------------
bibliographie_id -- FK to bibliographie
autore_id  -- FK to autore
primary -- boolean just in case you only want to show books an author is primary on
autore_order -- a numerical field in case you don't want to sort alphabetically

Then you’d have a query like

SELECT a.autore
     , a.autore_nome
     , -- (all book information) b.imagelink, b.titolo, b.sigla as 'data SUBSTRING(data,1,4)', b.edizione, b.luogo, b.data, b.trad_titolo, b.trad_edizione, b.trad_luogo, b.trad_data, b.rivista, b.num, b.pagg, b.keywords, b.ambito, b.reperibilita, b.scheda
    , a.primary -- (use to show with title if co-author)
  FROM autori a
 INNER JOIN bibliographie_autore ba ON a.autore_id = ba.autore_id
 INNER JOIN bibliografie b  ON ba.bibliographie_id = b.bibliographie_id
 ORDER BY a.autore
     , a.autore_nom
     , b.data
2 Likes

it actually does make sense (although your point about the co-authors raising an error because of syntax is correct)

there is only one table, the books, and the view that has the GROUP BY is merely extracting the distinct authors

1 Like

yes!

Exactly!

I add that the view query (bibliografie__autori), even deleting co-autore (not so necessary) gives, on phpmyadmin, an error (even working):

ALTER ALGORITHM = UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `bibliografie__autori` AS select `bibliografia`.`bibliografie`.`autore` AS `autore`,`bibliografia`.`bibliografie`.`autore_nome` AS `autore_nome`,`bibliografia`.`bibliografie`.`id_autore` AS `id_autore` from `bibliografia`.`bibliografie` where `bibliografia`.`bibliografie`.`autore` not like '%AA.VV.%' group by `bibliografia`.`bibliografie`.`autore`,`bibliografia`.`bibliografie`.`autore_nome`

Warning in ./libraries/classes/DatabaseInterface.php#815
Undefined array key 1

Backtrace

CreateController.php#164: PhpMyAdmin\DatabaseInterface->getColumnMapFromSql(
string 'select `bibliografia`.`bibliografie`.`autore` AS `autore`,`bibliografia`.`bibliografie`.`autore_nome` AS `autore_nome`,`bibliografia`.`bibliografie`.`id_autore` AS `id_autore` from `bibliografia`.`bibliografie` where `bibliografia`.`bibliografie`.`autore` not like \'%AA.VV.%\' group by `bibliografia`.`bibliografie`.`autore`,`bibliografia`.`bibliografie`.`autore_nome`',
array,
)
Routing.php#191: PhpMyAdmin\Controllers\View\CreateController->__invoke(
,
array,
)
index.php#43: PhpMyAdmin\Routing::callControllerForRoute(
,
string '/view/create',
,
,
)

as @DaveMaxwell guess

If I understand your words, @DaveMaxwell the only right solution would be to add author ids to every row of bibliografie.
But adding ids to every book (because this is this what I should do, isn’t it?) would be a very long and “painful” work :hot_face: (I should add to every book the id of its author).

EDIT

Moreover I tried to modify the view (bibliografie__autori) in this way:

select `bibliografia`.`bibliografie`.`autore` AS `autore`,`bibliografia`.`bibliografie`.`autore_nome` AS `autore_nome`,`bibliografia`.`bibliografie`.`id_autore` AS `id_autore` from `bibliografia`.`bibliografie` where `bibliografia`.`bibliografie`.`autore`  not like '%AA.VV.%' 
group by `bibliografia`.`bibliografie`.`id_autore`

and in the php

    SELECT a.autore, a.autore_nome, b.autore, b.autore_nome, a.id_autore, b.id_autore, b.imagelink, b.titolo, b.sigla as 'data SUBSTRING(data,1,4)', b.edizione, b.luogo, b.data, b.trad_titolo, b.trad_edizione, b.trad_luogo, b.trad_data, b.rivista, b.num, b.pagg, b.keywords, b.ambito, b.reperibilita, b.scheda
    FROM bibliografie__autori a
    INNER JOIN bibliografie b
    WHERE a.id_autore = b.id_autore
    ORDER BY a.autore, b.data

I have only books with author’s ids (as expected), but (not expected) I have two authors (Biffi Inos and Bifii Giacomo, with two different ids) still merged into one:

Under Biffi Inos there are books of Biffi Giacomo

this is what you wrote (inserting line breaks and spacing to make it human-readable) –

select `bibliografia`.`bibliografie`.`autore` AS `autore`
     , `bibliografia`.`bibliografie`.`autore_nome` AS `autore_nome`
     , `bibliografia`.`bibliografie`.`id_autore` AS `id_autore` 
  from `bibliografia`.`bibliografie` 
 where `bibliografia`.`bibliografie`.`autore` not like '%AA.VV.%' 
group 
    by `bibliografia`.`bibliografie`.`autore`
     , `bibliografia`.`bibliografie`.`autore_nome`

please replace that with this –

SELECT DISTINCT
       autore
     , autore_nome
  FROM bibliografia.bibliografie 
 WHERE autore NOT LIKE '%AA.VV.%' 

with this as the view, go back to joining to the books based on names

in fact, please explain why you actually need this view at all!!!

can’t you just pull your data from the books alone?

Done! It works as query, but in php the result is the same :frowning:

Sorry, I’m not so expert of sql: the reason is that I have automatically added new authors. If I had a table of authors I should add anew every time I add a new author.

So, working only on the table bibliografie (without a view or another table) I could get the same result (books “under” its author)?

EDIT

This code works! :grinning:

    $sql = $pdo->query("
    SELECT
    a.autore,
    a.autore_nome,
    b.autore,
    b.autore_nome,
    a.id_autore,
    b.id_autore,
    b.imagelink,
    b.titolo,
    SUBSTRING(b.data, 1, 4) AS data_anno,
    b.edizione,
    b.luogo,
    b.data,
    b.trad_titolo,
    b.trad_edizione,
    b.trad_luogo,
    b.trad_data,
    b.rivista,
    b.num,
    b.pagg,
    b.keywords,
    b.ambito,
    b.reperibilita,
    b.scheda
    FROM
    bibliografie__autori a
    INNER JOIN
    bibliografie b
    ON
    a.autore = b.autore
    AND a.autore_nome = b.autore_nome
    ORDER BY
    a.autore,
    a.autore_nome,
    b.data
    ");

    // Recupera i dati
    $results = $sql->fetchAll(PDO::FETCH_ASSOC);

    // Raggruppa i risultati per autore e nome
    $autori = [];
    foreach ($results as $row) {
            $key = $row['autore'] . ', ' . $row['autore_nome'];
            if (!isset($autori[$key])) {
                    $autori[$key] = [];
            }
            $autori[$key][] = $row;
    }

    // Output dei dati
    $schedature = "biblio/schedature/";
    foreach ($autori as $autore => $books) {
            echo "<div class='left'>";
            echo "<h2>" . $books[0]['autore'] . ", " . $books[0]['autore_nome'] . "</h2>\n";
            foreach ($books as $book) {
                    $imageastooltip = "<span class='tooltip'><span class='tooltiptext'><img src='$book[imagelink]' /></span><i>$book[titolo]</i></span>";
                    echo "<p>[";
                    echo substr($book['data'], 0, 4);
                    echo "] ";
                    if (!empty($book['imagelink'])) {
                            echo "<span class='tooltip min gray'><span class='tooltiptext'><img src=\"$book[imagelink]\" style=\"width: 120px;\" /></span>[cop]</span> ";
                    }
                    if (empty($book['rivista'])) {
                            echo "<i>$book[titolo]</i>, ";
                    } else {
                            echo "“$book[titolo]”, in <i>$book[rivista]</i>, ";
                    }
                    if (empty($book['rivista'])) {
                            echo "$book[edizione], $book[luogo] $book[data]";
                    } else {
                            echo "n. $book[num] ($book[data]), pp. $book[pagg].";
                    }
                    if (!empty($book['trad_titolo'])) {
                            echo ", tr.it. <i>$book[trad_titolo]</i>, $book[trad_edizione], $book[trad_luogo] $book[trad_data]";
                    }
                    if (!empty($book['scheda'])) {
                            echo "<details class='shadow'><summary>sintesi</summary>\n<md-block>";
                            include "$schedature/$book[scheda]";
                            echo "\n</md-block></details>";
                    }
                    echo "<span class='linkable'>";
                    $keywords = $book['keywords'];
                    foreach (explode(',', $keywords) as $key) {
                            if (!empty($key)) {Francesco
                                    echo "<span class='keyword'><a href=\"hashtag-biblio.php?tag=$key\">{$key}</a></span>";
                            }
                    }
                    $ambito = $book['ambito'];
                    foreach (explode(',', $ambito) as $keya) {
                            if (!empty($keya)) {
                                    echo "<span class='ambito'><a href=\"hashtag-biblio.php?tag=$keya\">{$keya}</a></span>";
                            }
                    }
                    $reperibilita = $book['reperibilita'];
                    foreach (explode(',', $reperibilita) as $keyr) {
                            if (!empty($keyr)) {
                                    echo "<span class='reperibilita'><a href=\"hashtag-biblio.php?tag=$keyr\">{$keyr}</a></span>";
                            }
                    }
                    echo "</span>.</p>";
            }
            echo "</div>\n";
    }
    ?>

Go figure. You’re now sorting by last name then first name then book. :wink:

which was my idea a long time ago

So it was, all the way back in post 8

Thank you all. The important, for me, is that the code works :slightly_smiling_face: